We study nonparametric estimation of the sub-distribution functions
for current status data with competing risks. Our main interest
is in the nonparametric maximum likelihood estimator (MLE), and
for comparison we also consider a simpler "naive estimator." Both
types of estimators were studied by Jewell, van der Laan and Henneman
[Biometrika (2003) 90 183-197], but little was known about their
large sample properties. We have started to fill this gap, by proving
that the estimators are consistent and converge globally and locally
at rate $n^{1/3}$. We also show that this local rate of convergence is
optimal in a minimax sense. The proof of the local rate of convergence of
the MLE uses new methods, and relies on a rate result for the sum of
the MLEs of the sub-distribution functions which holds uniformly on
a fixed neighborhood of a point. Our results are used in Groeneboom,
Maathuis and Wellner [Ann. Statist. (2008) 36 1064-1089] to obtain
the local limiting distributions of the estimators.

1. Introduction. We study current status data with competing risks. 
Such data arise naturally in cross-sectional studies with several failure causes. 
Moreover, generalizations of these data arise in HIV vaccine trials (see [5]). 
The general framework is as follows. We analyze a system that can fail
from $K$ competing risks, where $K \in \mathbb{N}$ is fixed. The random variables of
interest are $(X, Y)$, where $X \in \mathbb{R}$ is the failure time of the system, and
$Y \in \{1, \ldots, K\}$ is the corresponding failure cause. We cannot observe $(X, Y)$
directly. Rather, we observe the "current status" of the system at a single
random time $T \in \mathbb{R}$, where $T$ is independent of $(X, Y)$. This means that at
time $T$, we observe whether or not failure occurred, and if and only if failure
occurred, we also observe the failure cause $Y$.
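
As an illustration only, the observation scheme just described can be sketched in Python. The exponential distributions for the failure time and the observation time, the independence of the cause and the failure time, and the function name `simulate_current_status` are assumptions made for this toy example; they are not part of the model studied here.

```python
import random


def simulate_current_status(n, cause_probs, rate=1.0, seed=0):
    """Draw n observations (T, delta) of current status data with competing risks.

    Assumed toy model (not taken from the paper): X ~ Exponential(rate),
    Y independent of X with P(Y = k) = cause_probs[k - 1], and
    T ~ Exponential(rate) independent of (X, Y).
    """
    rng = random.Random(seed)
    K = len(cause_probs)
    data = []
    for _ in range(n):
        x = rng.expovariate(rate)                                  # unobserved failure time X
        y = rng.choices(range(1, K + 1), weights=cause_probs)[0]   # unobserved failure cause Y
        t = rng.expovariate(rate)                                  # observation time T
        delta = [0] * (K + 1)
        if x <= t:
            delta[y - 1] = 1   # failure occurred by time T, so the cause is observed
        else:
            delta[K] = 1       # no failure yet: we only learn that X > T
        data.append((t, delta))
    return data
```

Each simulated record carries exactly one nonzero indicator, matching the fact that at time $T$ the system is either still working or has failed from exactly one cause.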

We want to estimate the bivariate distribution of $(X, Y)$. Since
$Y \in \{1, \ldots, K\}$, this is equivalent to estimating the sub-distribution functions
$F_{0k}(s) = P(X \le s, Y = k)$, $k = 1, \ldots, K$. Note that the sum of the
sub-distribution functions $\sum_{k=1}^{K} F_{0k}(s) = P(X \le s)$ is the overall failure time
distribution. This shows that the sub-distribution functions are related to
each other and should be considered as a system.

We consider nonparametric estimation of the sub-distribution functions. 
This problem, or close variants thereof, has been studied by [5, 6, 7]. These 
papers introduced various nonparametric estimators, including the MLE (see 
[5, 7]) and a "naive estimator" (see [7]). They also provided algorithms to 
compute the estimators, and showed simulation studies that compared them. 
However, until now, little was known about the large sample properties of 
the estimators. 

We have started to fill this gap by developing the local asymptotic theory 
for the MLE and the naive estimator. We study the MLE because it is 
a natural estimator that often exhibits good behavior. The simpler naive 
estimator was suggested to be asymptotically efficient for the estimation of 
smooth functionals [7], and we therefore consider it for comparison. In the 
present paper we prove consistency and rates of convergence. These results 
are used in [3] to obtain the local limiting distributions. 

The outline of this paper is as follows. In Section 2 we introduce the esti- 
mators. We discuss their definitions, give existence and uniqueness results, 
and provide various characterizations in terms of necessary and sufficient 
conditions. Such characterizations are important since there is no closed 
form available for the MLE. In Section 3 we show that the estimators are 
globally and locally consistent. In Section 4 we prove that their global and
local rates of convergence are $n^{1/3}$ (Theorems 4.1 and 4.17). We also prove
that $n^{1/3}$ is an asymptotic local minimax lower bound for the rate of
convergence (Proposition 4.4). Hence, the estimators converge locally at the
optimal rate, in a minimax sense. The proof of the local rate of convergence
of the MLE uses new methods. One of the main difficulties in this proof
consists of handling the system of sub-distribution functions. We solve this
problem by first deriving a rate result for the sum of the MLEs of the
sub-distribution functions (Theorem 4.10). This rate result is stronger than
usual, since it holds uniformly on a fixed neighborhood of a point, instead
of on a shrinking neighborhood of order $n^{-1/3}$ (see Remark 4.11). Such a
strong result is needed to handle potential sparsity of the jump points of the 
MLEs of the sub-distribution functions (see Remark 4.18). Technical proofs 
are collected in Section 5, and computational aspects of the estimators are 
discussed in the companion paper [3], Section 4. 
Fig. 1. Graphical representation of the observed data $(T, \Delta)$ in an example with $K = 3$
competing risks. The black sets indicate the values of $(X, Y)$ that are consistent with $(T, \Delta)$,
for each of the four possible values of $\Delta$.



2. The estimators. We make the following assumptions: (a) the observation
time $T$ is independent of the variables of interest $(X, Y)$, and (b) the
system cannot fail from two or more causes at the same time. Assumption
(a) is essential for the development of the theory. Assumption (b) ensures
that the failure cause is well defined. This assumption can always be satisfied by
defining simultaneous failure from several causes as a new failure cause. We
allow ties in the observation times.

We now introduce some notation. We denote the observed data by $(T, \Delta)$,
where $T$ is the observation time and $\Delta = (\Delta_1, \ldots, \Delta_{K+1})$ is an indicator
vector defined by $\Delta_k = 1\{X \le T, Y = k\}$ for $k = 1, \ldots, K$, and
$\Delta_{K+1} = 1\{X > T\}$. The observed data are illustrated in Figure 1. Let
$(T_i, \Delta^i)$, $i = 1, \ldots, n$, be $n$ i.i.d. observations of $(T, \Delta)$, where
$\Delta^i = (\Delta^i_1, \ldots, \Delta^i_{K+1})$. Note that we
use the superscript $i$ as the index of an observation, and not as a power.
The order statistics of $T_1, \ldots, T_n$ are denoted by $T_{(1)}, \ldots, T_{(n)}$. Furthermore,
$G$ is the distribution of $T$, $G_n$ is the empirical distribution of $T_i$, $i = 1, \ldots, n$,
and $\mathbb{P}_n$ is the empirical distribution of $(T_i, \Delta^i)$, $i = 1, \ldots, n$. For any
vector $(x_1, \ldots, x_K) \in \mathbb{R}^K$ we use the shorthand notation $x_+ = \sum_{k=1}^{K} x_k$, so
that, for example, $\Delta_+ = \sum_{k=1}^{K} \Delta_k$ and $F_{0+}(s) = \sum_{k=1}^{K} F_{0k}(s)$. For any
$K$-tuple $F = (F_1, \ldots, F_K)$ of sub-distribution functions, we define
$F_{K+1}(s) = \int_{u>s} dF_+(u) = F_+(\infty) - F_+(s)$. Finally, we use the following conventions for
indicator functions and integrals:

Definition 2.1. Let $dA$ be a Lebesgue-Stieltjes measure. Then we define
for $t < t_0$:

\[ 1_{[t_0,t)}(u) = -1_{[t,t_0)}(u) \quad\text{and}\quad \int_{[t_0,t)} f(u)\,dA(u) = -\int_{[t,t_0)} f(u)\,dA(u). \]

2.1. Definitions of the estimators. We first consider the MLE. To understand
its form, let $F = (F_1, \ldots, F_K) \in \mathcal{F}_K$, where $\mathcal{F}_K$ is the collection of
$K$-tuples $F = (F_1, \ldots, F_K)$ of sub-distribution functions on $\mathbb{R}$ with $F_+ \le 1$.
Under $F$ we have $\Delta \mid T \sim \mathrm{Mult}_{K+1}(1, (F_1(T), \ldots, F_{K+1}(T)))$, so that the
density of a single observation is given by

(1) \[ p_F(t, \delta) = \prod_{k=1}^{K+1} F_k(t)^{\delta_k} = \prod_{k=1}^{K} F_k(t)^{\delta_k} \, (1 - F_+(t))^{1 - \delta_+}, \]

with respect to the dominating measure $\mu = G \times \#$, where $\#$ is the counting
measure on $\{e_k : k = 1, \ldots, K+1\}$ and $e_k$ is the $k$th unit vector in $\mathbb{R}^{K+1}$.
Hence, the log likelihood $l_n(F) = \int \log p_F(t, \delta) \, d\mathbb{P}_n(t, \delta)$ is given by

(2) \[ l_n(F) = \int \Bigl\{ \sum_{k=1}^{K} \delta_k \log F_k(t) + (1 - \delta_+) \log(1 - F_+(t)) \Bigr\} \, d\mathbb{P}_n(t, \delta). \]

It then follows that the MLE $\hat F_n = (\hat F_{n1}, \ldots, \hat F_{nK})$ is defined by

(3) \[ l_n(\hat F_n) = \max_{F \in \mathcal{F}_K} l_n(F). \]
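
To make the objective concrete, the following sketch evaluates the log likelihood of a candidate $K$-tuple of sub-distribution functions on a sample. It is an illustration under stated assumptions, not an implementation of the MLE: the data layout (a list of pairs of an observation time and an indicator vector of length $K+1$) and the function name `log_likelihood` are hypothetical choices made here.

```python
import math


def log_likelihood(data, F):
    """Evaluate the log likelihood l_n(F) for callables F = [F_1, ..., F_K].

    data: list of (t, delta), with delta an indicator vector of length K + 1.
    Returns -inf if an observed category has probability zero under F.
    """
    total = 0.0
    for t, delta in data:
        probs = [Fk(t) for Fk in F]
        probs.append(1.0 - sum(probs))      # F_{K+1}(t) = 1 - F_+(t)
        for d, p in zip(delta, probs):
            if d:
                if p <= 0.0:
                    return -math.inf
                total += math.log(p)
    return total / len(data)                # integral with respect to the empirical measure
```

Only the values of the candidate functions at the observation times enter the computation, which is why the maximization reduces to a finite-dimensional problem.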

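The naive estimator $\tilde F_n = (\tilde F_{n1}, \ldots, \tilde F_{nK})$ is defined by

(4) \[ l_{nk}(\tilde F_{nk}) = \max_{F_k \in \mathcal{F}} l_{nk}(F_k), \qquad k = 1, \ldots, K, \]

where $\mathcal{F}$ is the collection of all distribution functions on $\mathbb{R}$ and $l_{nk}(\cdot)$ is
the marginal log likelihood for the reduced current status data $(T_i, \Delta^i_k)$,
$i = 1, \ldots, n$:

\[ l_{nk}(F_k) = \int \{ \delta_k \log F_k(t) + (1 - \delta_k) \log(1 - F_k(t)) \} \, d\mathbb{P}_n(t, \delta), \qquad k = 1, \ldots, K. \]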

Thus, $\tilde F_{nk}$ uses only the $k$th entry of the $\Delta$-vector. We see that the naive
estimator splits the estimation problem into $K$ well-known univariate current
status problems. Therefore, its computation and asymptotic theory follow
straightforwardly from known results on current status data. But this
simplification comes at a cost. For example, it follows immediately that the
constraint $\tilde F_{n+} \le 1$ may be violated (see [7]).
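
The following sketch illustrates the naive estimator on a toy sample. It relies on the classical fact that the MLE for univariate current status data is the isotonic regression of the failure indicators on the ordered observation times, computed here with a pool-adjacent-violators pass. The helper names `pava` and `naive_estimator` and the data layout are assumptions of this example.

```python
def pava(values, weights):
    """Weighted nondecreasing least-squares fit by pool-adjacent-violators."""
    blocks = []  # each block holds [mean, weight, number of pooled points]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # pool the last two blocks while they violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            w_new = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w_new, w_new, c1 + c2])
    fit = []
    for m, _, c in blocks:
        fit.extend([m] * c)
    return fit


def naive_estimator(data, k):
    """Values of the naive estimator for cause k (1-based) at the distinct observation times."""
    counts = {}
    for t, delta in data:
        s, m = counts.get(t, (0, 0))
        counts[t] = (s + delta[k - 1], m + 1)   # ties in T are pooled with their multiplicity
    times = sorted(counts)
    means = [counts[t][0] / counts[t][1] for t in times]
    weights = [counts[t][1] for t in times]
    return times, pava(means, weights)
```

On the four-point sample `[(1, [1, 0, 0]), (2, [0, 0, 1]), (3, [1, 0, 0]), (4, [0, 1, 0])]` the two component estimates at the last observation time are 0.5 and 1, so their sum exceeds 1. This is a small instance of the constraint violation mentioned above.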

We note that both $\hat F_{n+}$ and $\tilde F_{n+}$ provide estimators for the overall failure
time distribution $F_{0+}$. A third estimator for this distribution is given by the
MLE for the reduced current status data $(T, \Delta_+)$, ignoring information on
the failure causes. These three estimators are typically not the same (see 
[5]). 

To compare the MLE and the naive estimator, we now define the naive 
estimator by a single optimization problem: 

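\[ \tilde l_n(\tilde F_n) = \max_{F \in \mathcal{F}^K} \tilde l_n(F), \quad\text{where}\quad \tilde l_n(F) = \sum_{k=1}^{K} l_{nk}(F_k), \]

and $\mathcal{F}^K$ is the $K$-fold product of $\mathcal{F}$. By comparing this to the optimization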
problem for the MLE, we note the following differences: 
(a) The object function $l_n(F)$ for the MLE contains the term $1 - F_+$,
involving the sum of the sub-distribution functions, while the object function
$\tilde l_n(F)$ for the naive estimator only contains the individual components.

(b) The space $\mathcal{F}_K$ for the MLE contains the constraint $F_+ \le 1$, while the
space $\mathcal{F}^K$ for the naive estimator only involves the individual components.

The more complicated object function for the MLE forces us to work with 
the system of sub-distribution functions, and poses new challenges in the 
derivation of the local rate of convergence of the MLE. Moreover, it gives rise 
to a new self-induced limiting process for the local limiting distribution of the 
MLE (see [3]). The constraint $F_+ \le 1$ on the space over which we maximize
is important for small sample sizes, but its effect vanishes asymptotically. 
These observations are supported by simulations in [3], Section 4. 

2.2. Existence and uniqueness. Since only values of the sub-distribution 
functions at the observation times appear in the log likelihoods $l_{nk}(F_k)$ and
$l_n(F)$, we limit ourselves to estimating these values. This means that the
optimization problems (3) and (4) reduce to finite-dimensional optimization
problems. Hence, their solutions exist by [19], Corollary 38.10.

For the naive estimator, the values of the sub-distribution functions at all
observation times enter in the log likelihood $l_{nk}(F_k)$. Together with strict
concavity of $l_{nk}(F_k)$, this implies that $\tilde F_{nk}$ is unique at all observation times,
for $k = 1, \ldots, K$. For the MLE, $F_k(T_i)$ appears in the log likelihood $l_n(F)$
if and only if $\Delta^i_k + \Delta^i_{K+1} > 0$. This motivates the following definition and
result:

Definition 2.2. For each $k = 1, \ldots, K+1$, we define the set $\mathcal{T}_k$ by
(5) \[ \mathcal{T}_k = \{T_i,\ i = 1, \ldots, n : \Delta^i_k + \Delta^i_{K+1} > 0\} \cup \{T_{(n)}\}. \]
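
As a small illustration of Definition 2.2, the sketch below collects, for a given cause index, the observation times at which the corresponding component enters the log likelihood, together with the largest observation time. The function name `support_set` and the data layout are assumptions of this example.

```python
def support_set(data, k):
    """Return the sorted set of Definition 2.2 for index k (1-based, k = 1, ..., K + 1).

    data: list of (t, delta), with delta an indicator vector of length K + 1.
    """
    K = len(data[0][1]) - 1
    points = {t for t, delta in data if delta[k - 1] + delta[K] > 0}
    points.add(max(t for t, _ in data))   # the largest observation time always belongs to the set
    return sorted(points)
```

For the sample `[(1, [1, 0, 0]), (2, [0, 0, 1]), (3, [1, 0, 0]), (4, [0, 1, 0])]`, cause 1 yields the times 1, 2, 3, 4, while cause 2 yields only 2 and 4.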

Proposition 2.3. For each $k = 1, \ldots, K+1$, $\hat F_{nk}(t)$ is unique at $t \in \mathcal{T}_k$.
Moreover, $\hat F_{nk}(\infty)$ is unique if and only if $\Delta^i_{K+1} = 0$ for all observations
with $T_i = T_{(n)}$.

Proof. We first prove uniqueness of $\hat F_{nk}(t)$ at $t \in \mathcal{T}_k$, for $k = 1, \ldots, K$.
Let $k \in \{1, \ldots, K\}$. Strict concavity of the log likelihood immediately gives
uniqueness of $\hat F_{nk}$ at points $T_i$ with $\Delta^i_k = 1$. Note that the log likelihood
is not strictly concave in $F_{nk}(T_i)$ if $\Delta^i_{K+1} = 1$, so that we need to do more
work to prove uniqueness at these points. First, one can show that $\hat F_{nk}$ can
only assign mass to intervals of the following form:

(i) $(T_i, T_j]$ where $\Delta^i_{K+1} = 1$, $\Delta^j_k = 1$ and $\Delta^l_k = \Delta^l_{K+1} = 0$ for all $l$ such that
$T_i < T_l < T_j$,
(ii) $(T_i, \infty)$ where $T_i = T_{(n)}$ and $\Delta^i_{K+1} = 1$

(see [5], Lemma 1, or use the concept of the height map of [9]). Note that
$\hat F_{nk}$ is unique at the right endpoints of the intervals given in (i), since $\hat F_{nk}$ is
unique at points $T_j$ with $\Delta^j_k = 1$. This implies that the probability mass in
each interval given in (i) is unique. In turn, this implies that $\hat F_{nk}$ is unique
at all points that are not in the interior of these intervals. In particular,
this gives uniqueness of $\hat F_{nk}(t)$ at $t \in \mathcal{T}_k$. The uniqueness statement about
$\hat F_{n,K+1}$ follows from the uniqueness of $\hat F_{n1}, \ldots, \hat F_{nK}$.

We now prove the statement about $\hat F_{nk}(\infty)$. First, if $\Delta^i_{K+1} = 0$ for all
observations with $T_i = T_{(n)}$, then $\hat F_{nk}$ can only assign mass to the intervals
given in (i). Hence, $\hat F_{nk}(\infty) = \hat F_{nk}(T_{(n)})$, and since $\hat F_{nk}(T_{(n)})$ was already
proved to be unique, it follows that $\hat F_{nk}(\infty)$ is unique. Conversely, if there
is a $T_i = T_{(n)}$ with $\Delta^i_{K+1} = 1$, then the log likelihood contains the term
$\log(1 - F_+(T_{(n)}))$. Hence, $\hat F_{n+}$ must assign mass to the right of $T_{(n)}$ in order
to get $l_n(\hat F_n) > -\infty$. The MLE is indifferent to the distribution of this mass
over $\hat F_{n1}, \ldots, \hat F_{nK}$, since their separate contributions do not appear in the
log likelihood. Hence, $\hat F_{nk}(\infty)$ is nonunique in this case. □

2.3. Characterizations. Characterizations of the naive estimators $\tilde F_{n1}, \ldots,
\tilde F_{nK}$ follow from [4], Propositions 1.1 and 1.2, pages 39-41. Characterizations
of the MLE can be derived from Karush-Kuhn-Tucker conditions, since the 
optimization problem can be reduced to a finite-dimensional optimization 
problem (see the first paragraph of Section 2.2). However, we give charac- 
terizations with direct proofs. These methods do not use the discrete nature 
of the problem, so that they can also be used for truly infinite-dimensional 
optimization problems. 

Definition 2.4. We define the processes $V_{nk}$ by
(6) \[ V_{nk}(t) = \int_{u \le t} \delta_k \, d\mathbb{P}_n(u, \delta), \qquad t \in \mathbb{R},\ k = 1, \ldots, K+1. \]

Moreover, let $\bar{\mathcal{F}}_K$ be the collection of $K$-tuples of bounded nonnegative
nondecreasing right-continuous functions.

Using this notation, we can write $l_n(F) = \sum_{k=1}^{K+1} \int \log F_k(u) \, dV_{nk}(u)$. In
Lemma 2.5 we translate the optimization problem (3) into an optimization
problem over a cone, by removing the constraint $F_+ \le 1$. Subsequently, we
give a basic characterization in Proposition 2.6. This characterization leads 
to various corollaries, of which Corollary 2.10 is most important for the 
sequel. 
Lemma 2.5. $\hat F_n$ maximizes $l_n(F)$ over $\mathcal{F}_K$ if and only if $\hat F_n$ maximizes
$\bar l_n(F)$ over $\bar{\mathcal{F}}_K$, where

\[ \bar l_n(F) = \sum_{k=1}^{K+1} \int \log F_k(u) \, dV_{nk}(u) - F_+(\infty). \]

Proof. (Necessity.) Let $\hat F_n$ maximize $l_n(F)$ over $\mathcal{F}_K$, and let $F \in \bar{\mathcal{F}}_K$.
We want to show that $\bar l_n(\hat F_n) \ge \bar l_n(F)$. Note that this inequality holds
trivially if $F_+(\infty) = 0$. Hence, we assume $F_+(\infty) = c > 0$. Then $F/c \in \mathcal{F}_K$, and
$l_n(\hat F_n) \ge l_n(F/c)$, by the assumption that $\hat F_n$ maximizes $l_n(F)$ over $\mathcal{F}_K$.
Together with $\hat F_{n+}(\infty) = 1$ this yields

\[ \bar l_n(\hat F_n) = l_n(\hat F_n) - 1 \ge l_n(F/c) - 1 = \sum_{k=1}^{K+1} \int \log F_k(u) \, dV_{nk}(u) - \log c - 1 = \bar l_n(F) + c - \log c - 1 \ge \bar l_n(F). \]

The last inequality follows since $x - \log x - 1 \ge 0$ for $x > 0$.

(Sufficiency.) Let $\hat F_n$ maximize $\bar l_n(F)$ over $\bar{\mathcal{F}}_K$, and let $\hat F_{n+}(\infty) = c$. As
before, we may assume $c > 0$. Then $\bar l_n(\hat F_n) \ge \bar l_n(\hat F_n/c)$, and by the same
reasoning as above this gives $\bar l_n(\hat F_n) \ge \bar l_n(\hat F_n/c) = l_n(\hat F_n/c) - 1 = \bar l_n(\hat F_n) +
c - \log c - 1$. Since $x - \log x - 1 \le 0$ if and only if $x = 1$, this yields $c = 1$.
Hence, $\hat F_n \in \mathcal{F}_K$, and $\hat F_n$ maximizes $l_n(F)$ over $\mathcal{F}_K \subset \bar{\mathcal{F}}_K$. □

We now obtain the following basic characterization of the MLE. 

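Proposition 2.6. $\hat F_n$ maximizes $l_n(F)$ over $\mathcal{F}_K$ if and only if $\hat F_n \in \bar{\mathcal{F}}_K$
and the following two conditions hold for all $k = 1, \ldots, K$:

(7) \[ \int_{u \ge t} \frac{dV_{nk}(u)}{\hat F_{nk}(u)} + \int_{u<t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} \le 1, \qquad t \in \mathbb{R}, \]

(8) \[ \int \Bigl\{ \int_{u \ge t} \frac{dV_{nk}(u)}{\hat F_{nk}(u)} + \int_{u<t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} - 1 \Bigr\} \, d\hat F_{nk}(t) = 0. \]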

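Proof. (Necessity.) Let $\hat F_n$ maximize $l_n(F)$ over $\mathcal{F}_K$. Then $\hat F_n$ also
maximizes $\bar l_n(F)$ over $\bar{\mathcal{F}}_K$, by Lemma 2.5. Fix $k \in \{1, \ldots, K\}$, and define the
perturbation $F_n^{(h)} = (F_{n1}^{(h)}, \ldots, F_{nK}^{(h)})$ by $F_{nk}^{(h)} = (1 + h)\hat F_{nk}$ and $F_{nj}^{(h)} = \hat F_{nj}$ for
$j \ne k$. Since $F_n^{(h)} \in \bar{\mathcal{F}}_K$ for $|h| < 1$, we get

\[ 0 = \lim_{h \to 0} h^{-1} \{ \bar l_n(F_n^{(h)}) - \bar l_n(\hat F_n) \}
 = \int dV_{nk}(u) + \int \frac{\hat F_{nk}(\infty) - \hat F_{nk}(u)}{\hat F_{n,K+1}(u)} \, dV_{n,K+1}(u) - \hat F_{nk}(\infty) \]
\[ \phantom{0} = \int \Bigl\{ \int_{u \ge t} \frac{dV_{nk}(u)}{\hat F_{nk}(u)} + \int_{u<t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} - 1 \Bigr\} \, d\hat F_{nk}(t), \]

using Fubini's theorem to obtain the last line. This gives condition (8).
Next, let $t \in \mathbb{R}$, and define the perturbation $F_n^{(h,t)} = (F_{n1}^{(h,t)}, \ldots, F_{nK}^{(h,t)})$ by
$F_{nk}^{(h,t)}(u) = \hat F_{nk}(u) + h 1_{[t,\infty)}(u)$ and $F_{nj}^{(h,t)} = \hat F_{nj}$ for $j \ne k$. Since $F_n^{(h,t)} \in \bar{\mathcal{F}}_K$
for $h > 0$, we get

\[ 0 \ge \lim_{h \downarrow 0} h^{-1} \{ \bar l_n(F_n^{(h,t)}) - \bar l_n(\hat F_n) \}
 = \int_{u \ge t} \frac{dV_{nk}(u)}{\hat F_{nk}(u)} + \int_{u<t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} - 1, \]

which is condition (7).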

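(Sufficiency.) Let $\hat F_n \in \bar{\mathcal{F}}_K$ satisfy conditions (7) and (8), and let $F \in \bar{\mathcal{F}}_K$.
We want to show that $\bar l_n(\hat F_n) \ge \bar l_n(F)$. Concavity of the logarithm yields

\[ \bar l_n(F) - \bar l_n(\hat F_n) \le \sum_{k=1}^{K+1} \int \frac{F_k(u) - \hat F_{nk}(u)}{\hat F_{nk}(u)} \, dV_{nk}(u) - F_+(\infty) + \hat F_{n+}(\infty). \]

We now show that the right-hand side of this display is nonpositive. By
Fubini, we have

\[ \sum_{k=1}^{K} \int \frac{F_k(u) - \hat F_{nk}(u)}{\hat F_{nk}(u)} \, dV_{nk}(u)
 = \sum_{k=1}^{K} \int \int_{t \le u} d(F_k - \hat F_{nk})(t) \, \frac{dV_{nk}(u)}{\hat F_{nk}(u)}
 = \sum_{k=1}^{K} \int \int_{u \ge t} \frac{dV_{nk}(u)}{\hat F_{nk}(u)} \, d(F_k - \hat F_{nk})(t) \]

and

\[ \int \frac{F_{K+1}(u) - \hat F_{n,K+1}(u)}{\hat F_{n,K+1}(u)} \, dV_{n,K+1}(u)
 = \int \int_{t>u} d(F_+ - \hat F_{n+})(t) \, \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)}
 = \sum_{k=1}^{K} \int \int_{u<t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} \, d(F_k - \hat F_{nk})(t). \]

Combining the last three displays gives

\[ \bar l_n(F) - \bar l_n(\hat F_n)
 \le \sum_{k=1}^{K} \int \Bigl\{ \int_{u \ge t} \frac{dV_{nk}(u)}{\hat F_{nk}(u)} + \int_{u<t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} - 1 \Bigr\} \, d(F_k - \hat F_{nk})(t) \]
\[ \phantom{\bar l_n(F) - \bar l_n(\hat F_n)}
 = \sum_{k=1}^{K} \int \Bigl\{ \int_{u \ge t} \frac{dV_{nk}(u)}{\hat F_{nk}(u)} + \int_{u<t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} - 1 \Bigr\} \, dF_k(t) \le 0, \]

where the equality follows from (8), and the final inequality follows from (7).
Hence $\hat F_n$ maximizes $\bar l_n(F)$ over $\bar{\mathcal{F}}_K$, and by Lemma 2.5 this implies that $\hat F_n$
maximizes $l_n(F)$ over $\mathcal{F}_K$. □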

Definition 2.7. We say that $t$ is a point of increase of a right-continuous
function $F$ if $F(t) > F(t - \varepsilon)$ for every $\varepsilon > 0$ (note that this definition is
slightly different from the usual definition). Moreover, for $F \in \bar{\mathcal{F}}_K$ we define

(9) \[ \beta_{nF} = 1 - \int \frac{dV_{n,K+1}(u)}{F_{K+1}(u)}. \]

Note that $\beta_{n\hat F_n}$ is uniquely defined, since $\hat F_{n,K+1}(t)$ is unique at points $t$
where $dV_{n,K+1}$ has mass (Proposition 2.3). We now rewrite the characterization
in Proposition 2.6 in terms of $\beta_{n\hat F_n}$:

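Corollary 2.8. $\hat F_n$ maximizes $l_n(F)$ over $\mathcal{F}_K$ if and only if $\hat F_n \in \bar{\mathcal{F}}_K$
and the following holds for all $k = 1, \ldots, K$:

(10) \[ \int_{u \ge t} \Bigl\{ \frac{dV_{nk}(u)}{\hat F_{nk}(u)} - \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} \Bigr\} \le \beta_{n\hat F_n}, \qquad t \in \mathbb{R}, \]

where equality holds if $t$ is a point of increase of $\hat F_{nk}$.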

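Proof. Since the integrand of (8) is a left-continuous function of $t$,
conditions (7) and (8) of Proposition 2.6 are equivalent to the condition
that for all $k = 1, \ldots, K$,

\[ \int_{u \ge t} \frac{dV_{nk}(u)}{\hat F_{nk}(u)} \le 1 - \int_{u<t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)}, \qquad t \in \mathbb{R}, \]

where equality must hold if $t$ is a point of increase of $\hat F_{nk}$. Combining this
with

\[ \int_{u<t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} = 1 - \beta_{n\hat F_n} - \int_{u \ge t} \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} \]

completes the proof. □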

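We determine the sign of $\beta_{n\hat F_n}$ in Corollary 2.9:

Corollary 2.9. Let $\hat F_n$ maximize $l_n(F)$ over $\mathcal{F}_K$. Then $\beta_{n\hat F_n} \ge 0$, and
$\beta_{n\hat F_n} = 0$ if and only if there is an observation with $T_i = T_{(n)}$ and $\Delta^i_{K+1} = 1$.

Proof. Taking $t > T_{(n)}$ in Corollary 2.8 implies that $\beta_{n\hat F_n} \ge 0$. Now
suppose that there is a $T_i = T_{(n)}$ with $\Delta^i_{K+1} = 1$. Then we must have
$\hat F_{n+}(T_{(n)}) < 1$ to obtain $l_n(\hat F_n) > -\infty$. Hence, there must be a $k \in \{1, \ldots, K\}$
such that $\hat F_{nk}$ has points of increase $t > T_{(n)}$. Corollary 2.8 then implies that
$\beta_{n\hat F_n} = 0$. Next, suppose that there does not exist a $T_i = T_{(n)}$ with $\Delta^i_{K+1} = 1$.
Then

\[ \int_{u \ge T_{(n)}} \Bigl\{ \frac{dV_{nk}(u)}{\hat F_{nk}(u)} - \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} \Bigr\} = \int_{u \ge T_{(n)}} \frac{dV_{nk}(u)}{\hat F_{nk}(u)}. \]

The right-hand side is strictly positive for any $k$ that is the observed failure
cause of an observation with $T_i = T_{(n)}$, and by Corollary 2.8 this implies $\beta_{n\hat F_n} > 0$. □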

We now make a first step toward localizing the characterization, in Corol- 
lary 2.10. This corollary forms the basis of Proposition 4.8, which is used in 
the proofs of the local rate of convergence and the limiting distribution of 
the MLE. 

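Corollary 2.10. $\hat F_n$ maximizes $l_n(F)$ over $\mathcal{F}_K$ if and only if $\hat F_n \in \bar{\mathcal{F}}_K$
and the following holds for all $k = 1, \ldots, K$ and each point of increase $\tau_{nk}$
of $\hat F_{nk}$:

(11) \[ \int_{[\tau_{nk}, s)} \Bigl\{ \frac{dV_{nk}(u)}{\hat F_{nk}(u)} - \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} \Bigr\} \ge \beta_{n\hat F_n} 1_{[\tau_{nk}, s)}(T_{(n)}), \qquad s \in \mathbb{R}, \]

where equality holds if $s$ is a point of increase of $\hat F_{nk}$, and if $s > T_{(n)}$.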

Proof. Let $\hat F_n$ maximize $l_n(F)$ over $\mathcal{F}_K$. Let $s > \tau_{nk}$. If $\tau_{nk} < s \le T_{(n)}$,
then (11) follows by applying (10) to $t = \tau_{nk}$ and $t = s$, and subtracting the
resulting equations. If $\tau_{nk} \le T_{(n)} < s$, then

\[ \int_{[\tau_{nk}, s)} \Bigl\{ \frac{dV_{nk}(u)}{\hat F_{nk}(u)} - \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} \Bigr\} = \int_{u \ge \tau_{nk}} \Bigl\{ \frac{dV_{nk}(u)}{\hat F_{nk}(u)} - \frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)} \Bigr\}, \]

so that the statement follows by applying (10) to $t = \tau_{nk}$. If $T_{(n)} < \tau_{nk} <
s$, then the left-hand side of (10) equals zero for $t = \tau_{nk}$ and $t = s$. The
inequalities for $s < \tau_{nk}$ can be derived analogously. Finally, the inequality
(11) and the corresponding equality condition imply (10). □



3. Consistency. Hellinger and $L_r(G)$ ($r \ge 1$) consistency of the naive
estimator follow from [13, 18]. Local consistency of the naive estimator follows
from [4, 13]. In this section we prove similar results for the MLE. First, note
that for two vectors of functions $F = (F_1, \ldots, F_K)$ and $F_0 = (F_{01}, \ldots, F_{0K})$
in $\mathcal{F}_K$, the Hellinger distance $h(p_F, p_{F_0})$ and the total variation distance
$d_{TV}(p_F, p_{F_0})$ in our model are given by

(12) \[ h^2(p_F, p_{F_0}) = \frac{1}{2} \int (\sqrt{p_F} - \sqrt{p_{F_0}})^2 \, d\mu = \frac{1}{2} \sum_{k=1}^{K+1} \int (\sqrt{F_k} - \sqrt{F_{0k}})^2 \, dG, \]

(13) \[ d_{TV}(p_F, p_{F_0}) = \frac{1}{2} \sum_{k=1}^{K+1} \int |F_k - F_{0k}| \, dG, \]

where $\mu = G \times \#$ and $p_F$ and $\#$ are defined in (1). The MLE is Hellinger
consistent:

Theorem 3.1. $h(p_{\hat F_n}, p_{F_0}) \to_{a.s.} 0$.

Proof. Since $\mathcal{P} = \{p_F : F \in \mathcal{F}_K\}$ is convex, we can use the following
inequality:

\[ h^2(p_{\hat F_n}, p_{F_0}) \le (\mathbb{P}_n - P) \varphi(p_{\hat F_n} / p_{F_0}), \]

where $\varphi(t) = (t - 1)/(t + 1)$ ([18], Proposition 3; see also [11] and [14, 15]).
Hence, it is sufficient to prove that $\{\varphi(p_F / p_{F_0}) : F \in \mathcal{F}_K\}$ is a
$P$-Glivenko-Cantelli class. This can be shown by Glivenko-Cantelli preservation
theorems of [18], using indicators of VC-classes of sets and monotone functions
as building blocks. Alternatively, the result follows directly from [18],
Theorem 9 by viewing the problem as a bivariate censored data problem for
$(X, Y)$. □

$L_r(G)$ consistency is given in Corollary 3.2, where the $L_r(G)$ distance is
defined by

(14) \[ \|F - F_0\|_{G,r} = \Bigl\{ \sum_{k=1}^{K+1} \int |F_k(t) - F_{0k}(t)|^r \, dG(t) \Bigr\}^{1/r}, \qquad r \ge 1. \]

Corollary 3.2. $\|\hat F_n - F_0\|_{G,r} \to_{a.s.} 0$ for $r \ge 1$.

Proof. Note that $\|F - F_0\|_{G,1} = 2 d_{TV}(p_F, p_{F_0})$. Hence, the statement
for $r = 1$ follows from the well-known inequality $d_{TV}(p_{F_1}, p_{F_2}) \le \sqrt{2}\, h(p_{F_1}, p_{F_2})$.
The result for $r > 1$ follows from $|a - b|^r \le |a - b|$ for $a, b \in [0, 1]$ and $r \ge 1$.
□
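
As a numerical illustration of (12) and (13), the sketch below evaluates both distances when the observation time distribution is discrete, and can be used to check the inequality between total variation and Hellinger distance on examples. The discrete form of $G$, the list-based representation of the sub-distribution functions, and the function names are assumptions of this example.

```python
import math


def with_survival(F_vals):
    """Append F_{K+1} = 1 - F_+ to the K sub-distribution values at one time point."""
    return list(F_vals) + [1.0 - sum(F_vals)]


def hellinger_sq(F, F0, g):
    """Squared Hellinger distance as in (12), for G with mass g[j] at the j-th support point."""
    return 0.5 * sum(
        gj * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                 for a, b in zip(with_survival(Fj), with_survival(F0j)))
        for Fj, F0j, gj in zip(F, F0, g))


def total_variation(F, F0, g):
    """Total variation distance as in (13), for the same discrete G."""
    return 0.5 * sum(
        gj * sum(abs(a - b) for a, b in zip(with_survival(Fj), with_survival(F0j)))
        for Fj, F0j, gj in zip(F, F0, g))
```

Here `F[j]` holds the values of the $K$ sub-distribution functions at the $j$-th support point of $G$; the survival component is appended internally so that all $K+1$ terms of the sums are included.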



Note that Theorem 3.1 and Corollary 3.2 hold without any additional 
assumptions. The quantities in these statements are integrated with respect 
to $G$, showing the importance of the observation time distribution. For
example, the results do not imply consistency at intervals where $G$ has zero
mass. Such issues should be taken into account if G can be chosen by design. 

Under some additional assumptions, Maathuis ([10], Section 4.2) proved 
several forms of local and uniform consistency using methods from [13], Sec- 
tion 3. One such result is needed in the proof of the local rate of convergence 
of the MLE, and is given below: 

Proposition 3.3. Let $F_{01}, \ldots, F_{0K}$ be continuous at $t_0$, and let $G$ be
continuously differentiable at $t_0$ with strictly positive derivative $g(t_0)$. Then
there exists an $r > 0$ such that

\[ \sup_{t \in [t_0 - r, t_0 + r]} |\hat F_{nk}(t) - F_{0k}(t)| \to_{a.s.} 0, \qquad k = 1, \ldots, K. \]

Proof. Let $k \in \{1, \ldots, K\}$ and choose the constant $r > 0$ such that $F_{0k}$
is continuous on $[t_0 - 2r, t_0 + 2r]$ and $g(t) > g(t_0)/2$ for $t \in [t_0 - 2r, t_0 + 2r]$.
Fix an $\omega$ for which the $L_1(G)$ consistency holds, and suppose there is an
$x_0 \in [t_0 - r, t_0 + r]$ for which $\hat F_{nk}(x_0, \omega)$ does not converge to $F_{0k}(x_0)$. Then
there is an $\varepsilon > 0$ such that for all $n_1 > 0$ there is an $n > n_1$ such that
$|\hat F_{nk}(x_0, \omega) - F_{0k}(x_0)| > \varepsilon$. Using the monotonicity of $\hat F_{nk}$ and the continuity
of $F_{0k}$, this implies there is a $\gamma > 0$ such that $|\hat F_{nk}(t, \omega) - F_{0k}(t)| > \varepsilon/2$ for
all $t \in (x_0 - \gamma, x_0]$ or $[x_0, x_0 + \gamma)$ and $[x_0 - \gamma, x_0 + \gamma] \subset [t_0 - 2r, t_0 + 2r]$. This
yields that $\int |\hat F_{nk}(t, \omega) - F_{0k}(t)| \, dG(t) > \gamma \varepsilon g(t_0)/4$, which contradicts $L_1(G)$
consistency. Uniform consistency follows since $F_{0k}$ is continuous. □

4. Rate of convergence. The Hellinger rate of convergence of the naive
estimator is $n^{1/3}$. This follows from [15] or [17], Theorem 3.4.4, page 327.
Under certain regularity conditions, the local rate of convergence of the
naive estimator is also $n^{1/3}$; see [4], Lemma 5.4, page 95. This local rate
result implies that the distance between two successive jump points of $\tilde F_{nk}$
around a point $t_0$ is of order $O_p(n^{-1/3})$.

In this section we discuss similar results for the MLE. In Section 4.1 we
show that the global rate of convergence is $n^{1/3}$. In Section 4.2 we prove that
$n^{1/3}$ is an asymptotic local minimax lower bound for the rate of convergence,
meaning that no estimator can converge locally at a rate faster than $n^{1/3}$, in
a minimax sense. Hence, the naive estimator converges locally at the optimal
rate. Since the MLE is expected to be at least as good as the naive estimator,
one may expect that the MLE also converges locally at the optimal rate of
$n^{1/3}$. This is indeed the case, and this is proved in Section 4.3 (Theorem
4.17). Our main tool for proving this result is Theorem 4.10, which gives a
uniform rate of convergence of $\hat F_{n+}$ on a fixed neighborhood of a point, rather
than on the usual shrinking neighborhood of order $n^{-1/3}$. Such a strong rate
result is needed to handle potential sparsity of the jump points of the MLEs
of the sub-distribution functions (see Remark 4.18). Some technical proofs
are deferred to Section 5.

4.1. Global rate of convergence. 

Theorem 4.1. $n^{1/3} h(p_{\hat F_n}, p_{F_0}) = O_p(1)$.

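Proof. We use the rate theorem of Van der Vaart and Wellner ([17],
Theorem 3.4.1, page 322) with

\[ m_{p_F}(t, \delta) = \log \frac{p_F(t, \delta) + p_{F_0}(t, \delta)}{2 p_{F_0}(t, \delta)}, \]

$\mathbb{M}_n(F) = \mathbb{P}_n m_{p_F}$, $\mathbb{M}(F) = P m_{p_F}$ and $\mathbb{G}_n m_{p_F} = \sqrt{n}(\mathbb{M}_n - \mathbb{M})(F)$. The key
condition to verify is $E\|\mathbb{G}_n\|_{\mathcal{M}_\gamma} \lesssim \phi_n(\gamma)$, where
$\mathcal{M}_\gamma = \{m_{p_F} - m_{p_{F_0}} : h(p_F, p_{F_0}) < \gamma\}$ and $\phi_n(\gamma)/\gamma^{\alpha}$ is a decreasing
function in $\gamma$ for some $\alpha < 2$. For
this purpose we use Theorem 3.4.4 of [17], which states that the functions
$m_{p_F}$ fit the setup of Theorem 3.4.1 of [17], and that

(15) \[ E\|\mathbb{G}_n\|_{\mathcal{M}_\gamma} \lesssim \tilde J_{[]}(\gamma, \mathcal{P}, h) \{ 1 + \tilde J_{[]}(\gamma, \mathcal{P}, h) \gamma^{-2} n^{-1/2} \}, \]

where $\tilde J_{[]}(\gamma, \mathcal{P}, h) = \int_0^{\gamma} \sqrt{1 + \log N_{[]}(\varepsilon, \mathcal{P}, h)} \, d\varepsilon$ and $\log N_{[]}(\varepsilon, \mathcal{P}, h)$ is
the $\varepsilon$-entropy with bracketing for $\mathcal{P} = \{p_F : F \in \mathcal{F}_K\}$ with respect to Hellinger
distance $h$. We first bound the bracketing number $N_{[]}(\varepsilon, \mathcal{P}, h)$. Let
$F = (F_1, \ldots, F_K) \in \mathcal{F}_K$. For each $k = 1, \ldots, K+1$, let $[l_k, u_k]$ be a bracket
containing $F_k$, with size $\int (\sqrt{u_k} - \sqrt{l_k})^2 \, dG \le \varepsilon^2/(K+1)$. Then

\[ [p_l(t, \delta), p_u(t, \delta)] = \Bigl[ \prod_{k=1}^{K+1} l_k(t)^{\delta_k}, \ \prod_{k=1}^{K+1} u_k(t)^{\delta_k} \Bigr] \]

is a bracket containing $p_F$, and its Hellinger size is bounded by $\varepsilon$.

Note that all $F_k$, $k = 1, \ldots, K+1$, are contained in the class
$\mathcal{F} = \{F : \mathbb{R} \to [0, 1] \text{ is monotone}\}$, and it is well known that
$\log N_{[]}(\delta, \mathcal{F}, L_2(Q)) \lesssim 1/\delta$, uniformly in $Q$. Hence, considering all
possible combinations of $(K+1)$-tuples
of the brackets $[l_k, u_k]$, it follows that

\[ \log N_{[]}(\varepsilon, \mathcal{P}, h) \le \log\bigl( \{ N_{[]}(\varepsilon/\sqrt{K+1}, \mathcal{F}, L_2(G)) \}^{K+1} \bigr)
 = (K+1) \log N_{[]}(\varepsilon/\sqrt{K+1}, \mathcal{F}, L_2(G)) \lesssim (K+1)^{3/2}/\varepsilon. \]

Dropping the dependence on $K$ (since $K$ is fixed), this implies that
$\tilde J_{[]}(\gamma, \mathcal{P}, h) \lesssim \gamma^{1/2}$, and together with (15) we obtain
$E\|\mathbb{G}_n\|_{\mathcal{M}_\gamma} \lesssim \sqrt{\gamma} + (\gamma\sqrt{n})^{-1}$. Since
$\gamma \mapsto (\sqrt{\gamma} + (\gamma\sqrt{n})^{-1})/\gamma$ is decreasing in $\gamma$, it is a valid choice for $\phi_n(\gamma)$
in Theorem 3.4.1 of [17]. We then obtain that $r_n h(p_{\hat F_n}, p_{F_0}) = O_p(1)$
provided that $h(p_{\hat F_n}, p_{F_0}) \to 0$ in outer probability, and $r_n^2 \phi_n(r_n^{-1}) \le \sqrt{n}$ for all
$n$. The first condition is fulfilled by the almost sure Hellinger consistency
of the MLE (Theorem 3.1). The second condition holds for $r_n = c n^{1/3}$ and
$c = ((\sqrt{5} - 1)/2)^{2/3}$. □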
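
The constant in the last step of the proof can be checked numerically. With the modulus equal to the square root of its argument plus the reciprocal of the argument times the square root of $n$, the rate condition for a rate of the form $c\,n^{1/3}$ reduces to $c^{3/2} + c^3 \le 1$, and the stated constant solves this with equality. The sketch below verifies this; the function names are choices made for this illustration.

```python
import math


def phi_n(gamma, n):
    """Modulus bound used in the proof: sqrt(gamma) + 1 / (gamma * sqrt(n))."""
    return math.sqrt(gamma) + 1.0 / (gamma * math.sqrt(n))


def rate_condition_gap(c, n):
    """Return sqrt(n) - r_n^2 * phi_n(1 / r_n) for r_n = c * n^(1/3).

    A nonnegative value means the rate condition holds for this c and n.
    """
    r_n = c * n ** (1.0 / 3.0)
    return math.sqrt(n) - r_n ** 2 * phi_n(1.0 / r_n, n)


# Largest constant for which the condition holds: solves c^(3/2) + c^3 = 1.
C_STAR = ((math.sqrt(5.0) - 1.0) / 2.0) ** (2.0 / 3.0)
```

The gap equals $(1 - c^{3/2} - c^3)\sqrt{n}$, so it vanishes at the stated constant for every $n$, is positive for smaller constants, and is negative for larger ones.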

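We obtain the following corollary about the $L_1(G)$ and $L_2(G)$ rates of
convergence:

Corollary 4.2. $n^{1/3} \|\hat F_n - F_0\|_{G,r} = O_p(1)$ for $r = 1, 2$.

Proof. The result for $r = 1$ again follows from $d_{TV}(p_{F_1}, p_{F_2})
\le \sqrt{2}\, h(p_{F_1}, p_{F_2})$. The result for $r = 2$ follows from

\[ \|F - F_0\|_{G,2}^2 = \sum_{k=1}^{K+1} \int (\sqrt{F_k} - \sqrt{F_{0k}})^2 (\sqrt{F_k} + \sqrt{F_{0k}})^2 \, dG \le 8 h^2(p_F, p_{F_0}), \]

using $\sqrt{F_k} + \sqrt{F_{0k}} \le 2$. □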

4.2. Asymptotic local minimax lower bound. In this section we prove that
$n^{1/3}$ is an asymptotic local minimax lower bound for the rate of convergence.
We use the set-up of [1], Section 4.1. Let $\mathcal{P}$ be a set of probability densities
on a measurable space $(\Omega, \mathcal{A})$ with respect to a $\sigma$-finite dominating measure.
We estimate a parameter $\theta = Up \in \mathbb{R}$, where $U$ is a real-valued functional and
$p \in \mathcal{P}$. Let $U_n$, $n \ge 1$, be a sequence of estimators based on a sample of size
$n$, that is, $U_n = t_n(Z_1, \ldots, Z_n)$, where $Z_1, \ldots, Z_n$ is a sample from the density
$p$, and $t_n : \Omega^n \to \mathbb{R}$ is a Borel measurable function. Let $l : [0, \infty) \to [0, \infty)$ be
an increasing convex loss function with $l(0) = 0$. The risk of the estimator
$U_n$ in estimating $Up$ is defined by $E_{n,p}\, l(|U_n - Up|)$, where $E_{n,p}$ denotes the
expectation with respect to the product measure $P^{\otimes n}$ corresponding to the
sample $Z_1, \ldots, Z_n$. We now recall Lemma 4.1 of [1].

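Lemma 4.3. For any $p_1, p_2 \in \mathcal{P}$ such that the Hellinger distance
$h(p_1, p_2) < 1$,

\[ \inf_{U_n} \max\{ E_{n,p_1} l(|U_n - Up_1|),\ E_{n,p_2} l(|U_n - Up_2|) \}
 \ge l\bigl( \tfrac{1}{4} |Up_1 - Up_2| \{1 - h^2(p_1, p_2)\}^{2n} \bigr). \]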

Let $k \in \{1, \ldots, K\}$ and let $U_{nk}$, $n \ge 1$, be a sequence of estimators of
$F_{0k}(t_0)$. Furthermore, let $c > 0$ and let $F_n = (F_{n1}, \ldots, F_{nK})$ be a perturbation
of $F_0$ where only the $k$th component is changed in the following way:

\[ F_{nk}(x) = \begin{cases}
 F_{0k}(t_0 - c n^{-1/3}), & \text{if } x \in [t_0 - c n^{-1/3}, t_0), \\
 F_{0k}(t_0 + c n^{-1/3}), & \text{if } x \in [t_0, t_0 + c n^{-1/3}), \\
 F_{0k}(x), & \text{otherwise},
\end{cases} \]

and $F_{nj}(x) = F_{0j}(x)$ for $j \ne k$. Note that $F_n \in \mathcal{F}_K$ is a valid set of
sub-distribution functions with overall survival function $F_{n,K+1} = 1 - F_{n+}$.
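
The perturbation can be written down directly. The sketch below builds the perturbed $k$th component from a given sub-distribution function; the function name `perturb` and the linear example used to exercise it are assumptions of this illustration.

```python
def perturb(F0k, t0, c, n):
    """Return the perturbed k-th component built from F0k around the point t0."""
    a = c * n ** (-1.0 / 3.0)          # half-width c * n^(-1/3) of the perturbation window

    def Fnk(x):
        if t0 - a <= x < t0:
            return F0k(t0 - a)         # flat at the left end of the window
        if t0 <= x < t0 + a:
            return F0k(t0 + a)         # jumps at t0 to the value at the right end
        return F0k(x)                  # unchanged outside the window

    return Fnk
```

The perturbed function stays monotone, and its value at the central point differs from the original value by an amount of order $n^{-1/3}$ when the original function has a positive derivative there. That difference is the separation between the two parameter values entering the two-point lower bound.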

We now apply Lemma 4.3 with $l(x) = x^r$, $p_1 = p_{F_0}$ and $p_2 = p_{F_n}$, where
$p_F$ is defined in (1). This gives a local minimax lower bound for the rate of
convergence. A detailed derivation of this result is given in [10], Section 5.2. 



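Proposition 4.4. Fix $k \in \{1, \ldots, K\}$. Let $0 < F_{0k}(t_0) < F_{0k}(\infty)$, and
let $F_{0k}$ and $G$ be continuously differentiable at $t_0$ with strictly positive
derivatives $f_{0k}(t_0)$ and $g(t_0)$. Let $d = 2^{-5/3} e^{-1/3}$. Then, for $r \ge 1$,

(16) \[ \liminf_{n \to \infty} n^{r/3} \inf_{U_{nk}} \max\bigl\{ E_{n,p_{F_0}} |U_{nk} - F_{0k}(t_0)|^r,\ E_{n,p_{F_n}} |U_{nk} - F_{nk}(t_0)|^r \bigr\}
 \ge d^r \Bigl[ \frac{g(t_0)}{f_{0k}(t_0)} \Bigl\{ \frac{1}{F_{0k}(t_0)} + \frac{1}{1 - F_{0+}(t_0)} \Bigr\} \Bigr]^{-r/3}. \]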

Remark 4.5. Note that the lower bound (16) consists of a part
depending on the underlying distribution, and a universal constant $d$. It is
not clear whether the constant depending on the underlying distribution is
sharp, because it has not been proved that any estimator achieves this
constant. However, we do know that the naive estimator $\tilde F_{nk}$ does generally not
achieve this constant. To see this, recall that $\tilde F_{nk}$ is the MLE for the reduced
data $(T_i, \Delta^i_k)$, $i = 1, \ldots, n$. Hence, its asymptotic risk is bounded below by
the asymptotic local minimax lower bound for current status data:

\[ d^r \Bigl[ \frac{g(t_0)}{f_{0k}(t_0)} \Bigl\{ \frac{1}{F_{0k}(t_0)} + \frac{1}{1 - F_{0k}(t_0)} \Bigr\} \Bigr]^{-r/3} \]

(see [1], (4.2), or take $K = 1$ in Proposition 4.4). Since $1 - F_{0k}(t_0) > 1 -
F_{0+}(t_0)$ if $F_{0j}(t_0) > 0$ for some $j \in \{1, \ldots, K\}$, $j \ne k$, this bound is larger
than the one given in (16).
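
The comparison in Remark 4.5 can be illustrated numerically. Both bounds share the same universal constant, so the sketch below evaluates only the distribution-dependent factor, once with the overall survival probability (competing risks) and once with one minus the $k$th sub-distribution function (reduced current status data). The function name `bound_factor` and the numerical values are hypothetical choices for this example.

```python
def bound_factor(g, f0k, F0k, survival, r=1.0):
    """Distribution-dependent factor [ (g / f0k) * (1 / F0k + 1 / survival) ]^(-r/3).

    survival is 1 - F_{0+}(t0) for the competing risks bound and
    1 - F_{0k}(t0) for the bound based on the reduced current status data.
    """
    return ((g / f0k) * (1.0 / F0k + 1.0 / survival)) ** (-r / 3.0)


# Hypothetical values at t0: g = 1, f0k = 0.5, F0k = 0.2, F0+ = 0.5.
competing_risks = bound_factor(1.0, 0.5, 0.2, 1.0 - 0.5)
current_status = bound_factor(1.0, 0.5, 0.2, 1.0 - 0.2)
```

With these values the current status factor is the larger of the two, in line with the remark; the two factors coincide when all other sub-distribution functions vanish at the point of interest.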

4.3. Local rate of convergence. As mentioned in the introduction of this 
section, the $n^{1/3}$ local rate of convergence of the naive estimator and the
$n^{1/3}$ local minimax lower bound for the rate of convergence suggest that the
MLE converges locally at rate $n^{1/3}$. This is indeed the case, and we now give
the proof of this result. However, although this result is intuitively clear, the 
proof is rather involved. 

The two main difficulties in the proof are the lack of a closed form for 
the MLE and the system of sub-distribution functions. We solve the first 
problem by working with a characterization of the MLE in terms of necessary 
and sufficient conditions. This approach was also followed in [1] for case 2 
interval censored data, and in [2] for convex density estimation. We handle 
the system of sub-distribution functions by first proving a rate result for
$\hat F_{n+}$ that holds uniformly on a fixed neighborhood around $t_0$, instead of on
the usual shrinking neighborhood of order $n^{-1/3}$.

The outline of this section is as follows. In Section 4.3.1 we revisit the 
characterization of the MLE, and derive a localized version of the conditions 
(Proposition 4.8). In Section 4.3.2 we use this characterization to prove the 
rate result for $\hat F_{n+}$ that is discussed above (Theorem 4.10). In Section 4.3.3
we use this result to prove the local rate of convergence for the components
$\hat F_{n1}, \ldots, \hat F_{nK}$ (Theorem 4.17). Some technical proofs are deferred to Section
5. 

Throughout, we assume that for each $k \in \{1, \ldots, K\}$, $\hat F_{nk}$ is piecewise
constant and right-continuous, with jumps only at points in $\mathcal{T}_k$ (see Definition
2.2). This assumption does not affect the asymptotic properties of the MLE.

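4.3.1. Revisiting the characterization. We consider the characterization
given in Corollary 2.10. Since it is difficult to work with $\hat F_{nk}$ in the
denominator, we start by rewriting the left-hand side of (11), using

$$\int_{[s,t)}\frac{dV_{nk}(u)}{\hat F_{nk}(u)} = \int_{[s,t)}\frac{dV_{nk}(u)}{F_{0k}(u)} + \int_{[s,t)}\frac{F_{0k}(u)-\hat F_{nk}(u)}{F_{0k}(u)\hat F_{nk}(u)}\,dV_{nk}(u).$$

This leads to the following lemma:

Lemma 4.6. For all $k=1,\ldots,K$ and $s,t\in\mathbb{R}$,

$$\begin{aligned}
\int_{[s,t)}\left\{\frac{dV_{nk}(u)}{\hat F_{nk}(u)}-\frac{dV_{n,K+1}(u)}{\hat F_{n,K+1}(u)}\right\}
&= \int_{[s,t)}\left\{\frac{dV_{nk}(u)}{F_{0k}(u)}-\frac{dV_{n,K+1}(u)}{F_{0,K+1}(u)}\right\}\\
&\quad+\int_{[s,t)}\frac{F_{0k}(u)-\hat F_{nk}(u)}{F_{0k}(u)\hat F_{nk}(u)}\,dV_{nk}(u)\\
&\quad-\int_{[s,t)}\frac{F_{0,K+1}(u)-\hat F_{n,K+1}(u)}{F_{0,K+1}(u)\hat F_{n,K+1}(u)}\,dV_{n,K+1}(u).
\end{aligned}$$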
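The algebra behind Lemma 4.6 is the pointwise identity $1/\hat F = 1/F_0 + (F_0-\hat F)/(F_0\hat F)$, integrated against $dV_{nk}$. The following is a minimal numerical sketch of that identity, assuming a discrete measure with finitely many atoms; the function names and all numerical values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def integral_over_fhat(f_hat, dv):
    """Sum of dV(u) / F_hat(u) over the atoms u of a discrete measure dV."""
    return sum(w / fh for fh, w in zip(f_hat, dv))


def decomposed_integral(f0, f_hat, dv):
    """Sum of dV / F0 plus the correction term (F0 - F_hat) / (F0 * F_hat) dV."""
    main = sum(w / a for a, w in zip(f0, dv))
    correction = sum((a - fh) / (a * fh) * w for a, fh, w in zip(f0, f_hat, dv))
    return main + correction


# Hypothetical values of F0, F_hat and the weights of dV at three atoms.
F0 = [Fraction(1, 5), Fraction(3, 10), Fraction(2, 5)]
F_HAT = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
DV = [Fraction(1, 10), Fraction(0), Fraction(1, 5)]

if __name__ == "__main__":
    # Exact rational arithmetic, so the two sides can be compared with ==.
    print(integral_over_fhat(F_HAT, DV) == decomposed_integral(F0, F_HAT, DV))
```

Exact fractions are used so that the comparison does not depend on floating-point rounding.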

We now combine Corollary 2.10 and Lemma 4.6 to obtain a localized 
version of the characterization in Proposition 4.8. We first introduce some 
definitions: 

Definition 4.7. Let $a_k = (F_{0k}(t_0))^{-1}$ for $k=1,\ldots,K+1$. Furthermore,
for $k=1,\ldots,K$, we define the processes $W_{nk}(\cdot)$ and $S_{nk}(\cdot)$ by

$$W_{nk}(t) = \int_{u\le t}\{\delta_k - F_{0k}(u)\}\,d\mathbb{P}_n(u,\delta), \tag{17}$$

$$S_{nk}(t) = a_kW_{nk}(t) + a_{K+1}W_{n+}(t). \tag{18}$$

Proposition 4.8. For each $k=1,\ldots,K$, let $0<F_{0k}(t_0)<F_{0k}(\infty)$, and
let $F_{0k}$ and $G$ be continuously differentiable at $t_0$ with strictly positive
derivatives $f_{0k}(t_0)$ and $g(t_0)$. Then there is an $r>0$ such that, for all $k=1,\ldots,K$
and each jump point $\tau_{nk}<T_{(n)}$ of $\hat F_{nk}$, we have

$$\int_{\tau_{nk}}^{s}\bigl\{a_k\{\hat F_{nk}(u)-F_{0k}(u)\}+a_{K+1}\{\hat F_{n+}(u)-F_{0+}(u)\}\bigr\}\,dG(u)
\le \int_{[\tau_{nk},s)}dS_{nk}(u)+R_{nk}(\tau_{nk},s)\quad\text{for } s<T_{(n)}, \tag{19}$$

where equality holds in (19) if $s$ is a jump point of $\hat F_{nk}$, and where

$$\sup_{t_0-2r\le s\le t\le t_0+2r}\frac{|R_{nk}(s,t)|}{n^{-2/3}\vee n^{-1/3}(t-s)^{3/2}} = O_p(1). \tag{20}$$

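Proof. Let $k\in\{1,\ldots,K\}$ and let $\tau_{nk}<T_{(n)}$ be a jump point of $\hat F_{nk}$.
Note that Corollary 2.10 and Lemma 4.6 imply that for all $s<T_{(n)}$,

$$\begin{aligned}
&\int_{[\tau_{nk},s)}\left\{\frac{\hat F_{nk}(u)-F_{0k}(u)}{F_{0k}(u)\hat F_{nk}(u)}\,dV_{nk}(u)-\frac{\hat F_{n,K+1}(u)-F_{0,K+1}(u)}{F_{0,K+1}(u)\hat F_{n,K+1}(u)}\,dV_{n,K+1}(u)\right\}\\
&\qquad\le\int_{u\in[\tau_{nk},s)}\left\{\frac{dV_{nk}(u)}{F_{0k}(u)}-\frac{dV_{n,K+1}(u)}{F_{0,K+1}(u)}\right\},
\end{aligned}\tag{21}$$

with equality if $s$ is a jump point of $\hat F_{nk}$. We first consider the left-hand
side of (21). For each $k\in\{1,\ldots,K+1\}$, we replace $\hat F_{nk}(u)$ by $F_{0k}(u)$ in the
denominator:

$$\int_{[s,t)}\frac{\hat F_{nk}(u)-F_{0k}(u)}{F_{0k}(u)\hat F_{nk}(u)}\,dV_{nk}(u)
=\int_{[s,t)}\frac{\hat F_{nk}(u)-F_{0k}(u)}{F_{0k}(u)^2}\,dV_{nk}(u)+\rho^{(1)}_{nk}(s,t), \tag{22}$$

$$\text{where}\quad \rho^{(1)}_{nk}(s,t) = -\int_{[s,t)}\frac{\{\hat F_{nk}(u)-F_{0k}(u)\}^2}{F_{0k}(u)^2\hat F_{nk}(u)}\,dV_{nk}(u). \tag{23}$$

Next, we replace $dV_{nk}(u)$ by $dV_k(u)=F_{0k}(u)\,dG(u)$ in the first term on the
right-hand side of (22):

$$\int_{[s,t)}\frac{\hat F_{nk}(u)-F_{0k}(u)}{F_{0k}(u)^2}\,dV_{nk}(u)
=\int_s^t\frac{\hat F_{nk}(u)-F_{0k}(u)}{F_{0k}(u)}\,dG(u)+\rho^{(2)}_{nk}(s,t), \tag{24}$$

$$\text{where}\quad \rho^{(2)}_{nk}(s,t)=\int_{[s,t)}\frac{\hat F_{nk}(u)-F_{0k}(u)}{F_{0k}(u)^2}\,d(V_{nk}-V_k)(u). \tag{25}$$

Finally, we replace the denominator $F_{0k}(u)$ by $F_{0k}(t_0)$ in the first term on
the right-hand side of (24):

$$\int_s^t\frac{\hat F_{nk}(u)-F_{0k}(u)}{F_{0k}(u)}\,dG(u)=\int_s^t\frac{\hat F_{nk}(u)-F_{0k}(u)}{F_{0k}(t_0)}\,dG(u)+\rho^{(3)}_{nk}(s,t),$$

and similarly on the right-hand side of (21):

$$\int_{[s,t)}\frac{dV_{nk}(u)}{F_{0k}(u)}=\int_{[s,t)}dG_n(u)+\frac{1}{F_{0k}(t_0)}\int_{[s,t)}\{dV_{nk}(u)-F_{0k}(u)\,dG_n(u)\}-\rho^{(4)}_{nk}(s,t),$$

where, with $G_n$ the empirical distribution of $T_1,\ldots,T_n$ (as defined in Section
2),

$$\rho^{(3)}_{nk}(s,t)=\int_{u\in[s,t)}\frac{\{\hat F_{nk}(u)-F_{0k}(u)\}\{F_{0k}(t_0)-F_{0k}(u)\}}{F_{0k}(u)F_{0k}(t_0)}\,dG(u), \tag{26}$$

$$\rho^{(4)}_{nk}(s,t)=\int_{[s,t)}\frac{F_{0k}(u)-F_{0k}(t_0)}{F_{0k}(u)F_{0k}(t_0)}\,d(V_{nk}-V_k)(u)
+\int_{[s,t)}\frac{F_{0k}(t_0)-F_{0k}(u)}{F_{0k}(t_0)}\,d(G_n-G)(u). \tag{27}$$

Inequality (19) then follows from $F_{K+1}=1-F_+$ for $F\in\mathcal{F}_K$, and the
definition

$$R_{nk}(s,t)=\sum_{\ell=1}^{4}\rho^{(\ell)}_{n,K+1}(s,t)-\sum_{\ell=1}^{4}\rho^{(\ell)}_{nk}(s,t),\qquad k=1,\ldots,K. \tag{28}$$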

We now show that the remainder term $R_{nk}(s,t)$ is of the given order.

Let $k\in\{1,\ldots,K+1\}$, and consider $\rho^{(1)}_{nk}$. Note that $\hat F_{nk}$ and $F_{0k}$ stay away
from zero with probability tending to 1 on $[t_0-2r,t_0+2r]$, by the assumption
$F_{0k}(t_0)>0$, the continuity of $F_{0k}$ at $t_0$, and the consistency of $\hat F_{nk}$
(Proposition 3.3). Furthermore,

$$\int\{\hat F_{nk}(u)-F_{0k}(u)\}^2\,dV_{nk}(u)
\le\int\{\hat F_{nk}(u)-F_{0k}(u)\}^2\,d(G_n-G)(u)+\int\{\hat F_{nk}(u)-F_{0k}(u)\}^2\,dG(u),$$

where the second term on the right-hand side is of order $O_p(n^{-2/3})$ by the
$L_2(G)$ rate of convergence given in Corollary 4.2, and the first term is of
order $O_p(n^{-2/3})$ by a modulus of continuity result. To see the latter, define

$$\mathcal{Q}=\{q_F(u)=\{F(u)-F_{0k}(u)\}^2: F\in\mathcal{F}\},\qquad
\mathcal{Q}(\gamma)=\Bigl\{q_F\in\mathcal{Q}:\int q_F(u)^2\,dG(u)\le\gamma^2\Bigr\},$$

where $\mathcal{F}$ is the class of monotone functions $F:\mathbb{R}\to[0,1]$. The $L_2(G)$ rate of
convergence (Corollary 4.2) implies that we can choose $C>0$ such that
$q_{\hat F_{nk}}\in\mathcal{Q}(Cn^{-1/3})$ with high probability. We then apply (5.42) of [16], Lemma 5.13,
with $\alpha=1$ and $\beta=0$ to the class $\mathcal{Q}(Cn^{-1/3})$. This yields that $\rho^{(1)}_{nk}(s,t)=O_p(n^{-2/3})$
uniformly in $t_0-2r\le s\le t\le t_0+2r$. Analogously, $\rho^{(2)}_{nk}(s,t)=O_p(n^{-2/3})$
uniformly in $t_0-2r\le s\le t\le t_0+2r$, using the $L_2(G)$ rate of
convergence and a modulus of continuity result. Next, we consider $\rho^{(3)}_{nk}(s,t)$.
By the Cauchy-Schwarz inequality,

$$|\rho^{(3)}_{nk}(s,t)|\le\left\{\int_s^t\left(\frac{F_{0k}(t_0)-F_{0k}(u)}{F_{0k}(u)F_{0k}(t_0)}\right)^2dG(u)\right\}^{1/2}\left\{\int_s^t\{\hat F_{nk}(u)-F_{0k}(u)\}^2\,dG(u)\right\}^{1/2}.$$

The first term of the product is of order $O((t-s)^{3/2})$, uniformly in
$t_0-2r\le s\le t\le t_0+2r$, by the continuous differentiability of $F_{0k}$. The
second term is of order $O_p(n^{-1/3})$ by the $L_2(G)$ rate of convergence. Hence,
$\rho^{(3)}_{nk}(s,t)=O_p(n^{-1/3}(t-s)^{3/2})$, uniformly in $t_0-2r\le s\le t\le t_0+2r$. Finally,
$\rho^{(4)}_{nk}(s,t)=O_p(n^{-1/2}(t-s))$, uniformly in $t_0-2r\le s\le t\le t_0+2r$, by writing
$\int_{[s,t)}=\int_{[s,t_0)}-\int_{[t,t_0)}$ and using Lemma 4.9 below. Since the term $O_p(n^{-1/2}(t-s))$
is dominated by $O_p(n^{-2/3}\vee n^{-1/3}(t-s)^{3/2})$ for all $s\le t$, it can be
omitted. □

Lemma 4.9. Let $F:\mathbb{R}\to\mathbb{R}$ be continuously differentiable at $t_0$ with
derivative $f(t_0)>0$. Then there is an $r>0$ so that uniformly in $t_0-2r\le s\le t\le t_0+2r$,

$$\int_{[s,t)}\{F(t)-F(u)\}\,d(G_n-G)(u)=O_p(n^{-1/2}(t-s)), \tag{29}$$

$$\int_{[s,t)}\frac{F(t)-F(u)}{F_{0k}(u)}\,d(V_{nk}-V_k)(u)=O_p(n^{-1/2}(t-s)),\qquad k=1,\ldots,K. \tag{30}$$

Proof. We only prove (29), because the proof of (30) is analogous.
Integration by parts yields

$$n^{1/2}\int_{[s,t)}\{F(t)-F(u)\}\,d(G_n-G)(u)
=n^{1/2}\Bigl[-\{F(t)-F(s)\}\{G_n(s)-G(s)\}+\int_{[s,t)}\{G_n(u)-G(u)\}\,dF(u)\Bigr].$$

Note that $n^{1/2}\sup_{u\in\mathbb{R}}|G_n(u)-G(u)|$ is tight, since it converges in
distribution to $\sup_{u\in\mathbb{R}}|B(G(u))|\le\sup_{x\in[0,1]}|B(x)|$, where $B$ is a standard Brownian
bridge on $[0,1]$. Hence, both terms on the right-hand side of the display are
$O_p(1)\{F(t)-F(s)\}=O_p(t-s)$, uniformly in $t_0-2r\le s\le t\le t_0+2r$. □
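The tightness of $n^{1/2}\sup_u|G_n(u)-G(u)|$ used in this proof can be illustrated by simulation. The sketch below is not part of the paper's argument: it assumes, purely for illustration, that the observation times are uniform on $[0,1]$, so that $G(u)=u$, and the function name is hypothetical.

```python
import math
import random


def scaled_sup_distance(sample):
    """Return sqrt(n) * sup_u |G_n(u) - u| for a sample from Uniform(0, 1).

    G_n is the empirical distribution function; the supremum is attained
    just before or at one of the order statistics.
    """
    n = len(sample)
    ordered = sorted(sample)
    distance = 0.0
    for i, x in enumerate(ordered):
        # G_n jumps from i/n to (i + 1)/n at x, while G(x) = x.
        distance = max(distance, (i + 1) / n - x, x - i / n)
    return math.sqrt(n) * distance


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 1000, 10000):
        sample = [rng.random() for _ in range(n)]
        # The scaled statistic stays of order one as n grows.
        print(n, scaled_sup_distance(sample))
```

The printed values should remain bounded as $n$ grows, which is the behavior the proof relies on.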

4.3.2. Uniform rate of convergence of $\hat F_{n+}$ on a fixed neighborhood of $t_0$.
The main result of this section is a rate of convergence result for $\hat F_{n+}$ which
holds uniformly on a fixed neighborhood $[t_0-r,t_0+r]$ of $t_0$, rather than on
a shrinking neighborhood of the form $[t_0-Mn^{-1/3},t_0+Mn^{-1/3}]$ (Theorem
4.10). We discuss the meaning of this result in Remark 4.11, by comparing
it to several existing results for current status data without competing risks.
Theorem 4.10 is used in Section 4.3.3 to prove the local rate of convergence
of the components $\hat F_{n1},\ldots,\hat F_{nK}$.

Theorem 4.10. For all $k=1,\ldots,K$, let $0<F_{0k}(t_0)<F_{0k}(\infty)$, and let
$F_{0k}$ and $G$ be continuously differentiable at $t_0$ with strictly positive
derivatives $f_{0k}(t_0)$ and $g(t_0)$. For $\beta\in(0,1)$ we define

$$v_n(t)=\begin{cases} n^{-1/3}, & \text{if } |t|\le n^{-1/3},\\ n^{-(1-\beta)/3}|t|^{\beta}, & \text{if } |t|>n^{-1/3}.\end{cases} \tag{31}$$

Then there exists a constant $r>0$ so that

$$\sup_{t\in[t_0-r,t_0+r]}\frac{|\hat F_{n+}(t)-F_{0+}(t)|}{v_n(t-t_0)}=O_p(1). \tag{32}$$

Note that the function $v_n(t)=n^{-1/3}$ for $|t|\le n^{-1/3}$. Outside an $n^{-1/3}$
neighborhood we cannot expect to get an $n^{-1/3}$ rate. Therefore, for $|t|>n^{-1/3}$
we let the function $v_n(t)$ grow with $|t|$, by defining $v_n(t)=n^{-(1-\beta)/3}|t|^{\beta}$.
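As a minimal sketch of the envelope $v_n$ in (31), the following function evaluates it numerically. The function name and the example values of $n$, $\beta$ and $M$ are illustrative assumptions; the checks only confirm that the two branches agree at $|t|=n^{-1/3}$ and that $v_n(Mn^{-1/3})=M^{\beta}n^{-1/3}$ for $M>1$.

```python
import math


def v_n(t, n, beta):
    """Envelope from (31): n^(-1/3) near zero, n^(-(1-beta)/3) * |t|^beta outside."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie in (0, 1)")
    threshold = n ** (-1.0 / 3.0)
    if abs(t) <= threshold:
        return threshold
    return n ** (-(1.0 - beta) / 3.0) * abs(t) ** beta


if __name__ == "__main__":
    n, beta, M = 1000, 0.5, 4.0  # illustrative values only
    threshold = n ** (-1.0 / 3.0)
    # Continuity at the boundary |t| = n^(-1/3): compare with a point just outside.
    print(math.isclose(v_n(threshold, n, beta), v_n(threshold * (1 + 1e-12), n, beta)))
    # Scaling relation used in Remark 4.11: v_n(M n^(-1/3)) = M^beta n^(-1/3).
    print(math.isclose(v_n(M * threshold, n, beta), M ** beta * threshold))
```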

Before giving the proof of Theorem 4.10, we discuss its meaning by com- 
paring it to several known results for current status data without competing 
risks. 



[Figure 2. Caption fragment: note that $\beta$ close to zero gives the sharpest bound.]

Remark 4.11. By taking $K=1$ in Theorem 4.10, it follows that the
theorem holds for the MLE $\hat F_n$ for current status data without competing
risks. Thus, to clarify the meaning of Theorem 4.10, we can compare it to
known results for $\hat F_n$. First, we consider the local rate of convergence given
in [4], Lemma 5.4, page 95. For $M>0$, they prove that

$$\sup_{t\in[-M,M]}|\hat F_n(t_0+n^{-1/3}t)-F_0(t_0)|=O_p(n^{-1/3}). \tag{33}$$

We can obtain this bound by applying Theorem 4.10 to $t\in[t_0-Mn^{-1/3},t_0+Mn^{-1/3}]$,
and using the continuous differentiability of $F_0$ at $t_0$ and the fact
that

$$v_n(t-t_0)\le v_n(Mn^{-1/3})=M^{\beta}n^{-1/3}\qquad\text{for } M>1,\ t\in[t_0-Mn^{-1/3},t_0+Mn^{-1/3}].$$

Hence, Theorem 4.10 implies (33) for $M>1$.

Next, we consider the global bound of [4], Lemma 5.9:

$$\sup_{t\in\mathbb{R}}|\hat F_n(t)-F_0(t)|=O_p(n^{-1/3}\log n). \tag{34}$$

The result in Theorem 4.10 is fundamentally different from (34), since it is
stronger than (34) for $|t-t_0|<n^{-1/3}(\log n)^{1/\beta}$, and it is weaker outside this
region.
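The crossover point in this comparison can be checked numerically. The sketch below, with hypothetical function names and illustrative values of $n$ and $\beta$, compares the envelope $n^{-(1-\beta)/3}|t|^{\beta}$ of Theorem 4.10 with the global rate $n^{-1/3}\log n$ of (34); it ignores the constants hidden in the $O_p$ terms and the restriction to a fixed neighborhood.

```python
import math


def v_n(t, n, beta):
    """Envelope of Theorem 4.10 as a function of the distance t from t_0."""
    threshold = n ** (-1.0 / 3.0)
    if abs(t) <= threshold:
        return threshold
    return n ** (-(1.0 - beta) / 3.0) * abs(t) ** beta


def global_rate(n):
    """Rate n^(-1/3) log n appearing in the global bound (34)."""
    return n ** (-1.0 / 3.0) * math.log(n)


def crossover(n, beta):
    """Distance n^(-1/3) (log n)^(1/beta) at which the two rates coincide."""
    return n ** (-1.0 / 3.0) * math.log(n) ** (1.0 / beta)


if __name__ == "__main__":
    n, beta = 10 ** 6, 0.5  # illustrative values only
    c = crossover(n, beta)
    print(v_n(0.5 * c, n, beta) < global_rate(n))  # local envelope is smaller inside
    print(v_n(2.0 * c, n, beta) > global_rate(n))  # and larger outside
```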

Remark 4.12. Note that Theorem 4.10 gives a family of bounds in $\beta$.
Choosing $\beta$ close to zero gives the tightest bound, as illustrated in Figure
2. For the proof of the local rate of convergence of $\hat F_{n1},\ldots,\hat F_{nK}$ (Theorem
4.17), it is sufficient that Theorem 4.10 holds for one arbitrary value of
$\beta\in(0,1)$. Stating the theorem for one fixed $\beta$ leads to a somewhat simpler
proof. However, for completeness we present the result for all $\beta\in(0,1)$.

As an introduction to the proof of Theorem 4.10 we first note the following.
Let $\varepsilon>0$ and let $r>0$ be small. Then the continuous differentiability
of $F_{0+}$ at $t_0$ implies

$$F_{0+}(t+Mv_n(t-t_0))\le F_{0+}(t)+2Mv_n(t-t_0)f_{0+}(t_0),\qquad t\in[t_0-r,t_0+r],$$

$$F_{0+}(t-Mv_n(t-t_0))\ge F_{0+}(t)-2Mv_n(t-t_0)f_{0+}(t_0),\qquad t\in[t_0-r,t_0+r].$$

Hence, it is sufficient to show that we can choose $n_1$ and $M$ such that for
all $n>n_1$

$$P\bigl\{\exists t\in[t_0-r,t_0+r]:\hat F_{n+}(t)\notin\bigl(F_{0+}(t-Mv_n(t-t_0)),\,F_{0+}(t+Mv_n(t-t_0))\bigr)\bigr\}<\varepsilon.$$

In fact, we only prove that there exist $n_1$ and $M$ such that

$$P\{\exists t\in[t_0,t_0+r]:\hat F_{n+}(t)\ge F_{0+}(t+Mv_n(t-t_0))\}<\varepsilon/4,\qquad n>n_1, \tag{35}$$

since the proofs for $\hat F_{n+}(t)\le F_{0+}(t-Mv_n(t-t_0))$ and the interval $[t_0-r,t_0]$
are analogous. In the proof of (35) we use the fact that we can choose $r$, $n_1$
and $C$ such that $P(E^c_{nrC})<\varepsilon/8$ for all $n>n_1$, where

$$E_{nrC}=\bigcap_{k=1}^{K}\Bigl\{\hat F_{nk}\text{ has a jump in }(t_0-2r,t_0-r),\ T_{(n)}>t_0+2r,\
\sup_{t_0-2r\le w\le t\le t_0+2r}\frac{|R_{nk}(w,t)|}{n^{-2/3}\vee n^{-1/3}(t-w)^{3/2}}\le C\Bigr\}, \tag{36}$$

and $R_{nk}(w,t)$ is defined in Proposition 4.8. For the event involving $R_{nk}$ this
follows from Proposition 4.8. For the event that $\hat F_{nk}$ has a jump point in
$(t_0-2r,t_0-r)$, this follows from consistency of $\hat F_{nk}$ (Proposition 3.3) and the
strict monotonicity of $F_{0k}$ in a neighborhood of $t_0$. Finally, $T_{(n)}>t_0+2r$ for
sufficiently large $n$ follows from the positive density $g$ in a neighborhood
of $t_0$.

Proof of Theorem 4.10. By the discussion above, and by writing

$$\begin{aligned}
&P\{\exists t\in[t_0,t_0+r]:\hat F_{n+}(t)\ge F_{0+}(t+Mv_n(t-t_0))\}\\
&\qquad\le P(E^c_{nrC})+P\{\exists t\in[t_0,t_0+r]:\hat F_{n+}(t)\ge F_{0+}(t+Mv_n(t-t_0)),\ E_{nrC}\},
\end{aligned}\tag{37}$$

it is sufficient to show that we can choose $n_1$, $M$ and $C$ such that the second
term of (37) is bounded by $\varepsilon/8$ for all $n>n_1$. In order to show this, we put
a grid on the interval $[t_0,t_0+r]$, analogously to [8], Lemma 4.1. The grid
points $t_{nj}$ and grid cells $I_{nj}$ are denoted by

$$t_{nj}=t_0+jn^{-1/3}\quad\text{and}\quad I_{nj}=[t_{nj},t_{n,j+1}) \tag{38}$$

for $j=0,\ldots,J_n=\lceil rn^{1/3}\rceil$. This yields

$$\begin{aligned}
&P\{\exists t\in[t_0,t_0+r]:\hat F_{n+}(t)\ge F_{0+}(t+Mv_n(t-t_0)),\ E_{nrC}\}\\
&\qquad\le\sum_{j=0}^{J_n}P\{\exists t\in I_{nj}:\hat F_{n+}(t)\ge F_{0+}(t+Mv_n(t-t_0)),\ E_{nrC}\}.
\end{aligned}$$

Hence, it is sufficient to show that we can choose $n_1$ and $m_1$ such that for
all $n>n_1$, $M>m_1$ and $j=0,\ldots,J_n$, we have

$$P\{\exists t\in I_{nj}:\hat F_{n+}(t)\ge F_{0+}(t+Mv_n(t-t_0)),\ E_{nrC}\}\le p_{jM}, \tag{39}$$

where $p_{jM}$ satisfies $\limsup_{n\to\infty}\sum_{j=0}^{J_n}p_{jM}\to0$ as $M\to\infty$. We prove (39)
for

$$p_{jM}=\begin{cases} d_1\exp\{-d_2M^3\}, & \text{if } j=0,\\ d_1\exp\{-d_2(Mj^{\beta})^3\}, & \text{if } j=1,\ldots,J_n,\end{cases} \tag{40}$$

where $d_1$ and $d_2$ are positive constants. Using the monotonicity of $\hat F_{n+}$, it
is sufficient to prove that for all $n>n_1$, $M>m_1$ and $j=0,\ldots,J_n$,

$$P\{A_{njM},\ E_{nrC}\}\le p_{jM}, \tag{41}$$

where

$$A_{njM}=\{\hat F_{n+}(t_{n,j+1})\ge F_{0+}(s_{njM})\}, \tag{42}$$

$$s_{njM}=t_{nj}+Mv_n(t_{nj}-t_0). \tag{43}$$

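Fix $n>0$ and $M>0$, and let $j\in\{0,\ldots,J_n\}$. Let $\tau_{nkj}$ be the last jump
point of $\hat F_{nk}$ before $t_{n,j+1}$, for $k=1,\ldots,K$. On the event $E_{nrC}$, these jump
points exist and are in $(t_0-2r,t_{n,j+1}]$. Without loss of generality we assume
that the sub-distribution functions are labeled so that $\tau_{n1j}\le\cdots\le\tau_{nKj}$.
On the event $A_{njM}$ there must be a $k\in\{1,\ldots,K\}$ for which $\hat F_{nk}(t_{n,j+1})\ge F_{0k}(s_{njM})$.
Hence, we can define $\ell\in\{1,\ldots,K\}$ such that

$$\hat F_{nk}(t_{n,j+1})<F_{0k}(s_{njM}),\qquad k=\ell+1,\ldots,K, \tag{44}$$

$$\hat F_{n\ell}(t_{n,j+1})\ge F_{0\ell}(s_{njM}). \tag{45}$$

Since $s_{njM}\le t_0+2r$ for $n$ large, and $t_0+2r<T_{(n)}$ on the event $E_{nrC}$, we
have

$$\int_{\tau_{n\ell j}}^{s_{njM}}\bigl\{a_\ell\{\hat F_{n\ell}(u)-F_{0\ell}(u)\}+a_{K+1}\{\hat F_{n+}(u)-F_{0+}(u)\}\bigr\}\,dG(u)
\le\int_{[\tau_{n\ell j},s_{njM})}dS_{n\ell}(u)+R_{n\ell}(\tau_{n\ell j},s_{njM})$$

by Proposition 4.8. Hence, $P(A_{njM},E_{nrC})$ equals

$$P\Bigl(\int_{\tau_{n\ell j}}^{s_{njM}}\bigl\{a_\ell\{\hat F_{n\ell}(u)-F_{0\ell}(u)\}+a_{K+1}\{\hat F_{n+}(u)-F_{0+}(u)\}\bigr\}\,dG(u)
\le\int_{[\tau_{n\ell j},s_{njM})}dS_{n\ell}(u)+R_{n\ell}(\tau_{n\ell j},s_{njM}),\ A_{njM},\ E_{nrC}\Bigr),$$

and this is bounded above by

$$P\Bigl(\int_{\tau_{n\ell j}}^{s_{njM}}a_\ell\{\hat F_{n\ell}(u)-F_{0\ell}(u)\}\,dG(u)-\int_{[\tau_{n\ell j},s_{njM})}dS_{n\ell}(u)
\le R_{n\ell}(\tau_{n\ell j},s_{njM}),\ A_{njM},\ E_{nrC}\Bigr) \tag{46}$$

$$+\,P\Bigl(\int_{\tau_{n\ell j}}^{s_{njM}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u)\le0,\ A_{njM},\ E_{nrC}\Bigr). \tag{47}$$

We now show that both terms (46) and (47) are bounded above by $p_{jM}/2$.
Note that (45) implies that on the event $A_{njM}$,

$$\hat F_{n\ell}(u)\ge\hat F_{n\ell}(\tau_{n\ell j})=\hat F_{n\ell}(t_{n,j+1})\ge F_{0\ell}(s_{njM})\qquad\text{for } u\ge\tau_{n\ell j},$$

using the definition of $\tau_{n\ell j}$, and the fact that $\hat F_{n\ell}$ is piecewise constant and
monotone nondecreasing. Hence, on the event $A_{njM}$ we have

$$\int_{\tau_{n\ell j}}^{s_{njM}}\{\hat F_{n\ell}(u)-F_{0\ell}(u)\}\,dG(u)\ge\int_{\tau_{n\ell j}}^{s_{njM}}\{F_{0\ell}(s_{njM})-F_{0\ell}(u)\}\,dG(u)
\ge\tfrac14 g(t_0)f_{0\ell}(t_0)(s_{njM}-\tau_{n\ell j})^2,$$

for all $\tau_{n\ell j}\in[t_0-2r,t_{n,j+1}]$ and $r$ sufficiently small. Combining this with
the definition of $E_{nrC}$ [see (36)], it follows that (46) is bounded above by

$$P\Bigl(\inf_{w\in[t_0-2r,t_{n,j+1}]}\Bigl\{\tfrac14 g(t_0)a_\ell f_{0\ell}(t_0)(s_{njM}-w)^2-\int_{[w,s_{njM})}dS_{n\ell}(u)
-C\bigl(n^{-2/3}\vee n^{-1/3}(s_{njM}-w)^{3/2}\bigr)\Bigr\}\le0\Bigr). \tag{48}$$

For $m_1$ and $n_1$ sufficiently large, this probability is bounded above by $p_{jM}/2$
for all $M>m_1$, $n>n_1$ and $j\in\{0,\ldots,J_n\}$, using Lemma 4.13 below. Similarly,
(47) is bounded above by $p_{jM}/2$, using Lemma 4.14 below. This proves
(41) and completes the proof. □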
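The role of the bounds $p_{jM}$ in (40) is that their sum over the grid cells is finite uniformly in the number of cells and vanishes as $M\to\infty$. A minimal numerical sketch follows; the constants $d_1=d_2=1$, the value of $\beta$ and the function names are illustrative assumptions, not values from the paper.

```python
import math


def p_jM(j, M, beta, d1=1.0, d2=1.0):
    """Bound (40): d1*exp(-d2*M^3) for j = 0, d1*exp(-d2*(M*j^beta)^3) for j >= 1."""
    scale = M if j == 0 else M * j ** beta
    return d1 * math.exp(-d2 * scale ** 3)


def total_bound(M, J, beta, d1=1.0, d2=1.0):
    """Sum of p_jM over the grid cells j = 0, ..., J."""
    return sum(p_jM(j, M, beta, d1, d2) for j in range(J + 1))


if __name__ == "__main__":
    beta = 0.5  # illustrative value only
    for M in (1.0, 2.0, 3.0):
        # The sum barely changes with the number of cells and shrinks quickly in M.
        print(M, total_bound(M, 10 ** 3, beta), total_bound(M, 10 ** 4, beta))
```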

Lemmas 4.13 and 4.14 play a crucial role in the proof of Theorem 4.10.
The probability statement in Lemma 4.13 consists of three terms: a
deterministic parabolic drift $b(s_{njM}-w)^2$, a martingale $S_{nk}$, and a remainder
term $C(n^{-2/3}\vee n^{-1/6}(s_{njM}-w)^{3/2})$. The basic idea of the lemma is that the
quadratic drift dominates the martingale and the remainder term. Lemma
4.14 controls the term that involves the sum of the components. In this
lemma the key idea is to exploit the system of sub-distribution functions,
and play out the different components against each other. The proofs of
both lemmas are given in Section 5.

Finally, we note that (48) in the proof of Theorem 4.10 contains a smaller
remainder term $C(n^{-2/3}\vee n^{-1/3}(s_{njM}-w)^{3/2})$ than the one in Lemma 4.13.
Hence, (48) is also bounded above by $p_{jM}$. We choose to state Lemma 4.13
in terms of the larger remainder term $C(n^{-2/3}\vee n^{-1/6}(s_{njM}-w)^{3/2})$, since
we need the lemma in this form for the proof of Theorem 4.17.
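The claim that the quadratic drift dominates the remainder term can be illustrated numerically. The sketch below checks when $C(n^{-2/3}\vee n^{-1/6}x^{3/2})\le\tfrac12bx^2$ for $x\ge(M-1)n^{-1/3}$. The factor $\tfrac12$, the sufficient threshold $M-1\ge\max\{(2C/b)^{1/2},(2C/b)^2\}$ (obtained by elementary algebra for this sketch), the function names and all numerical values are assumptions made for illustration.

```python
def remainder(C, n, x):
    """Larger remainder term C * max(n^(-2/3), n^(-1/6) * x^(3/2))."""
    return C * max(n ** (-2.0 / 3.0), n ** (-1.0 / 6.0) * x ** 1.5)


def half_drift(b, x):
    """One half of the parabolic drift b * x^2."""
    return 0.5 * b * x * x


def sufficient_M(C, b):
    """A value of M for which the drift dominates whenever x >= (M - 1) n^(-1/3)."""
    ratio = 2.0 * C / b
    return 1.0 + max(ratio ** 0.5, ratio ** 2)


def drift_dominates(C, b, n, M, points=50):
    """Check remainder <= half drift on a grid of x >= (M - 1) n^(-1/3)."""
    x0 = (M - 1.0) * n ** (-1.0 / 3.0)
    grid = [x0 * (1.0 + i) for i in range(points)]
    return all(remainder(C, n, x) <= half_drift(b, x) for x in grid)


if __name__ == "__main__":
    C, b, n = 2.0, 1.0, 10 ** 4  # illustrative values only
    print(drift_dominates(C, b, n, sufficient_M(C, b) + 1.0))  # large M: dominated
    print(drift_dominates(C, b, n, 2.0))                       # small M: not dominated
```

The threshold on $M$ does not depend on $n$, which is why a single choice of $M$ works for all sample sizes.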

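Lemma 4.13. Let $C>0$ and $b>0$. Then there exist $r>0$, $n_1>0$ and
$m_1>0$ such that for all $k=1,\ldots,K$, $n>n_1$, $M>m_1$ and $j\in\{0,\ldots,J_n=\lceil rn^{1/3}\rceil\}$,

$$P\Bigl(\inf_{w\in[t_0-2r,t_{n,j+1}]}\Bigl\{b(s_{njM}-w)^2-\int_{[w,s_{njM})}dS_{nk}(u)
-C\bigl(n^{-2/3}\vee n^{-1/6}(s_{njM}-w)^{3/2}\bigr)\Bigr\}\le0\Bigr)\le p_{jM},$$

where $s_{njM}=t_{nj}+Mv_n(t_{nj}-t_0)$, and $S_{nk}(\cdot)$, $v_n(\cdot)$ and $p_{jM}$ are defined by
(18), (31) and (40), respectively.

Lemma 4.14. Let the conditions of Theorem 4.10 be satisfied, and let $\ell$
be defined by (44) and (45). Then there exist $r>0$, $n_1>0$ and $m_1>0$ such
that for all $n>n_1$, $M>m_1$ and $j\in\{0,\ldots,J_n=\lceil rn^{1/3}\rceil\}$,

$$P\Bigl\{\int_{\tau_{n\ell j}}^{s_{njM}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u)\le0,\ A_{njM},\ E_{nrC}\Bigr\}\le p_{jM},$$

where $\tau_{n\ell j}$ is the last jump point of $\hat F_{n\ell}$ before $t_{n,j+1}$,
$s_{njM}=t_{nj}+Mv_n(t_{nj}-t_0)$, and $E_{nrC}$, $p_{jM}$ and $A_{njM}$ are defined by (36), (40)
and (42), respectively.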

Remark 4.15. The conditions of Theorem 4.10 also hold when $t_0$ is
replaced by $s$, for $s$ in a neighborhood of $t_0$. Hence, the results in this
section continue to hold when $t_0$ is replaced by $s\in[t_0-r,t_0+r]$, for $r>0$
sufficiently small. To be precise, there exists an $r>0$ such that for every
$\varepsilon>0$ there exist $C>0$ and $n_1>0$ such that

$$P\Bigl(\sup_{t\in[t_0-r,t_0+r]}\frac{|\hat F_{n+}(t)-F_{0+}(t)|}{v_n(t-s)}>C\Bigr)<\varepsilon$$

for $s\in[t_0-r,t_0+r]$, $n>n_1$.







In Remark 4.12 we already mentioned that, in order to prove the local rate
of convergence of the components $\hat F_{n1},\ldots,\hat F_{nK}$, we only need Theorem 4.10
to hold for one value of $\beta\in(0,1)$. Therefore, we now fix $\beta=1/2$ so that
$v_n(t)=n^{-1/3}\vee n^{-1/6}\sqrt{|t|}$.

Then Remark 4.15 leads to the following corollary:

Corollary 4.16. Let the conditions of Theorem 4.10 be satisfied. Then
there exists an $r>0$ such that for every $\varepsilon>0$ there exist $C>0$ and $n_1>0$
such that

$$P\Bigl(\sup_{t\in[t_0-r,s]}\frac{\int_t^s|\hat F_{n+}(u)-F_{0+}(u)|\,dG(u)}{n^{-2/3}\vee n^{-1/6}(s-t)^{3/2}}>C\Bigr)<\varepsilon$$

for $s\in[t_0-r,t_0+r]$, $n>n_1$.

4.3.3. Local rate of convergence of $\hat F_{n1},\ldots,\hat F_{nK}$. We are now ready to
prove the local rate of convergence of $\hat F_{n1},\ldots,\hat F_{nK}$. The proof is again based
on the localized characterization given in Proposition 4.8, but we now use
Corollary 4.16 to bound the term involving $\hat F_{n+}$ [see (52) ahead].

Theorem 4.17. Let the conditions of Theorem 4.10 be satisfied. Then
there exists an $r>0$ such that for every $\varepsilon>0$ and $M_1>0$ there exist $M>0$
and $n_1>0$ such that

$$P\Bigl(\sup_{t\in[-M_1,M_1]}n^{1/3}|\hat F_{nk}(s+n^{-1/3}t)-F_{0k}(s)|>M\Bigr)<\varepsilon,\qquad k=1,\ldots,K,$$

for all $n>n_1$ and $s\in[t_0-r,t_0+r]$.

Proof. For the reasons discussed in Remark 4.15, it is sufficient to prove
the result for $s=t_0$. Let $\varepsilon>0$, $M_1>0$ and $k\in\{1,\ldots,K\}$. We want to show
that there exist constants $M>M_1$ and $n_1>0$ such that for all $n>n_1$,

$$P\{\hat F_{nk}(t_0+Mn^{-1/3})\ge F_{0k}(t_0+2Mn^{-1/3})\}<\varepsilon, \tag{49}$$

$$P\{\hat F_{nk}(t_0-Mn^{-1/3})\le F_{0k}(t_0-2Mn^{-1/3})\}<\varepsilon. \tag{50}$$

We only prove (49), since the proof of (50) is analogous. Define

$$B_{nkM}=\{\hat F_{nk}(t_0+Mn^{-1/3})\ge F_{0k}(s_{nM})\}\quad\text{and}\quad s_{nM}=t_0+2Mn^{-1/3},$$

and let $\tau_{nk}$ be the last jump point of $\hat F_{nk}$ before $t_0+Mn^{-1/3}$. Since we
may assume that $s_{nM}\le t_0+r<T_{(n)}$ for $n$ sufficiently large, Proposition 4.8
yields

$$P(B_{nkM})=P\Bigl(\int_{\tau_{nk}}^{s_{nM}}\bigl\{a_k\{\hat F_{nk}(u)-F_{0k}(u)\}+a_{K+1}\{\hat F_{n+}(u)-F_{0+}(u)\}\bigr\}\,dG(u)
\le\int_{[\tau_{nk},s_{nM})}dS_{nk}(u)+R_{nk}(\tau_{nk},s_{nM}),\ B_{nkM}\Bigr). \tag{51}$$

By consistency of $\hat F_{nk}$ (Proposition 3.3) and the strict monotonicity of $F_{0k}$
in a neighborhood of $t_0$, we may assume that $\tau_{nk}\in[t_0-r,t_0+Mn^{-1/3}]$.
Moreover, by Proposition 4.8 and Corollary 4.16 we can choose $C>0$ such
that, with high probability,

$$|R_{nk}(\tau_{nk},s_{nM})|\le C\bigl(n^{-2/3}\vee n^{-1/3}(s_{nM}-\tau_{nk})^{3/2}\bigr),$$

$$\int_{\tau_{nk}}^{s_{nM}}|\hat F_{n+}(u)-F_{0+}(u)|\,dG(u)\le C\bigl(n^{-2/3}\vee n^{-1/6}(s_{nM}-\tau_{nk})^{3/2}\bigr), \tag{52}$$

uniformly in $\tau_{nk}\in[t_0-r,t_0+Mn^{-1/3}]$. Finally, note that on the event $B_{nkM}$,
we have $\int_{\tau_{nk}}^{s_{nM}}\{\hat F_{nk}(u)-F_{0k}(u)\}\,dG(u)\ge\int_{\tau_{nk}}^{s_{nM}}\{F_{0k}(s_{nM})-F_{0k}(u)\}\,dG(u)$,
yielding a positive quadratic drift. The statement now follows by combining
these facts with (51), and applying Lemma 4.13. □

Remark 4.18. Note that Theorem 4.10 and Corollary 4.16 yielded the
bound (52) in the proof of Theorem 4.17. Such a bound would not have been
possible using rate results like (33) or (34) for $\hat F_{n+}$. A bound of the form
(33) cannot be used, since we cannot assume that $s_{nM}-\tau_{nk}=O_p(n^{-1/3})$.
A bound of the form (34) would change the right-hand side of (52) to
$Cn^{-1/3}(s_{nM}-\tau_{nk})\log n$, and this is not dominated by the quadratic drift
$(s_{nM}-\tau_{nk})^2$ for $s_{nM}-\tau_{nk}>Mn^{-1/3}$. Even a stronger global bound of the form
$O_p(n^{-1/3}\log\log n)$ would not suffice for this purpose. This shows that the
rate result given in Theorem 4.10 was essential for the proof of Theorem
4.17.

Corollary 4.19. Let the conditions of Theorem 4.10 be satisfied. For
all $k=1,\ldots,K$, let $\tau^-_{nk}(s)$ and $\tau^+_{nk}(s)$ be, respectively, the largest jump point
$\le s$ and the smallest jump point $>s$ of $\hat F_{nk}$. Then there exists an $r>0$
such that for every $\varepsilon>0$ there exist $n_1>0$ and $C>0$ such that for all
$k=1,\ldots,K$,

$$P(\tau^+_{nk}(s)-\tau^-_{nk}(s)>Cn^{-1/3})<\varepsilon\qquad\text{for } n>n_1,\ s\in[t_0-r/2,t_0+r/2].$$

Proof. Let $\varepsilon>0$ and $r>0$. Take an arbitrary value for $M_1$ (say $M_1=1$),
and choose $M$ and $n_1$ according to Theorem 4.17. Next, choose $C>0$
such that

$$F_{0k}(s-Cn^{-1/3})+Mn^{-1/3}<F_{0k}(s)-Mn^{-1/3}\qquad\text{for } s\in[t_0-r/2,t_0+r/2]. \tag{53}$$

Note that $s-Cn^{-1/3}\in[t_0-r,t_0+r]$ for all $s\in[t_0-r/2,t_0+r/2]$ and $n>n_1$,
for $n_1$ sufficiently large. Hence, applying Theorem 4.17 to $s$ and $s-Cn^{-1/3}$
yields

$$P\bigl(\hat F_{nk}(s-Cn^{-1/3})\le F_{0k}(s-Cn^{-1/3})+Mn^{-1/3}\bigr)>1-\varepsilon,\qquad
P\bigl(\hat F_{nk}(s)\ge F_{0k}(s)-Mn^{-1/3}\bigr)>1-\varepsilon,$$

for $n>n_1$. Together with (53) this implies that $P\{s-\tau^-_{nk}(s)>Cn^{-1/3}\}<2\varepsilon$,
for $n>n_1$ and $s\in[t_0-r/2,t_0+r/2]$. Similar reasoning holds for $\tau^+_{nk}(s)$. □

We now obtain a bound for the remainder terms $R_{nk}(s,t)$ in Proposition
4.8, for $t_0-mn^{-1/3}\le s\le t\le t_0+mn^{-1/3}$ and $m>0$. This bound is
used in Proposition 3.2 of [3], which is a recentered and rescaled
characterization of the MLE that is needed to prove the limiting distribution.

Corollary 4.20. Let $m>0$ and let $R_{nk}(s,t)$, $k=1,\ldots,K$, be the
remainder terms in Proposition 4.8, defined by (28). Then

$$\sup_{t_0-mn^{-1/3}\le s\le t\le t_0+mn^{-1/3}}|R_{nk}(s,t)|=O_p(n^{-5/6}). \tag{54}$$

Proof. Since $R_{nk}(s,t)=\sum_{\ell=1}^{4}\rho^{(\ell)}_{n,K+1}(s,t)-\sum_{\ell=1}^{4}\rho^{(\ell)}_{nk}(s,t)$, it is
sufficient to show that the terms $\rho^{(\ell)}_{nk}(s,t)$, $k=1,\ldots,K+1$, $\ell=1,\ldots,4$, are of
the right order, uniformly in $t_0-mn^{-1/3}\le s\le t\le t_0+mn^{-1/3}$.

Let $m>0$ and $k\in\{1,\ldots,K+1\}$. We first consider $\rho^{(1)}_{nk}$, defined by (23).
By the local rate of convergence (Theorem 4.17) and the continuous
differentiability of $F_{0k}$ at $t_0$, we have $\hat F_{nk}(u)-F_{0k}(u)=O_p(n^{-1/3})$, uniformly in
$u\in[t_0-mn^{-1/3},t_0+mn^{-1/3}]$. Moreover, the assumption $F_{0k}(t_0)>0$, the
consistency of $\hat F_{nk}$ (Proposition 3.3), and the continuity of $F_{0k}$ at $t_0$, imply
that $\{F_{0k}(u)^2\hat F_{nk}(u)\}^{-1}=O_p(1)$, uniformly in $u\in[t_0-mn^{-1/3},t_0+mn^{-1/3}]$.
Hence,

$$|\rho^{(1)}_{nk}(s,t)|\le O_p(n^{-2/3})\int_{[t_0-mn^{-1/3},\,t_0+mn^{-1/3})}dV_{nk}(u)=O_p(n^{-1}),$$

uniformly in $t_0-mn^{-1/3}\le s\le t\le t_0+mn^{-1/3}$.

Next, we consider $\rho^{(2)}_{nk}$, defined by (25). We apply Theorem 2.11.22 of [17]
to the class $\mathcal{Q}_n$, where

$$\mathcal{Q}_n=\Bigl\{q_{n,F_n,t}(u)=\sqrt{n}\,\frac{F_n(u)-F_{0k}(u)}{F_{0k}(u)^2}\,1_{[t_0,\,t_0+n^{-1/3}t)}(u): t\in[-m,m],\ F_n\in\mathcal{F}_n\Bigr\},$$

$$\mathcal{F}_n=\Bigl\{F_n:\mathbb{R}\to[0,1],\ F_n\text{ monotone},\ \sup_{u\in[-m,m]}|(F_n-F_{0k})(t_0+n^{-1/3}u)|\le Cn^{-1/3}\Bigr\}.$$

This yields that the sequence $\{\sqrt{n}(V_{nk}-V_k)q_{n,F_n,t}: t\in[-m,m],\ F_n\in\mathcal{F}_n\}$ is
tight. Moreover, for every $\varepsilon>0$ we can choose $C>0$ and $n_1>0$ such that
$P(\hat F_{nk}\in\mathcal{F}_n)>1-\varepsilon$ for all $n>n_1$, by the local rate of convergence of $\hat F_{nk}$
(Theorem 4.17) and the continuous differentiability of $F_{0k}$ at $t_0$. This implies
that $\rho^{(2)}_{nk}(s,t)=O_p(n^{-1})$, uniformly in $t_0-mn^{-1/3}\le s\le t\le t_0+mn^{-1/3}$,
since

$$\sqrt{n}(V_{nk}-V_k)q_{n,\hat F_{nk},t}=n\int_{[t_0,\,t_0+n^{-1/3}t)}\frac{\hat F_{nk}(u)-F_{0k}(u)}{F_{0k}(u)^2}\,d(V_{nk}-V_k)(u).$$

Finally, we consider the terms $\rho^{(3)}_{nk}$ and $\rho^{(4)}_{nk}$, defined by (26) and (27). We
showed in the proof of Proposition 4.8 that $\rho^{(3)}_{nk}(s,t)=O_p(n^{-1/3}(t-s)^{3/2})$
and $\rho^{(4)}_{nk}(s,t)=O_p(n^{-1/2}(t-s))$, uniformly in $t_0-r\le s\le t\le t_0+r$. Plugging
in $t-s\le2mn^{-1/3}$ completes the proof. □

5. Technical proofs. 

Proof of Lemma 4.13. Let $k\in\{1,\ldots,K\}$, $n>0$ and $j\in\{0,\ldots,J_n\}$.
Note that for $M$ large, we have for all $w\le t_{n,j+1}$:

$$C\bigl(n^{-2/3}\vee n^{-1/6}(s_{njM}-w)^{3/2}\bigr)\le\tfrac12b(s_{njM}-w)^2,$$

since $s_{njM}-w\ge(M-1)n^{-1/3}$. Hence, the probability in the statement of
Lemma 4.13 is bounded above by

$$P\Bigl\{\sup_{w\in[t_0-2r,t_{n,j+1}]}\Bigl(\int_{[w,s_{njM})}dS_{nk}(u)-\tfrac12b(s_{njM}-w)^2\Bigr)\ge0\Bigr\}. \tag{55}$$

In order to bound this probability, we put a grid on the interval
$[t_0-2r,t_{n,j+1})$, with grid points $t_{n,j-q}$ and grid cells $I_{n,j-q}$ given by

$$t_{n,j-q}=t_0+(j-q)n^{-1/3}\quad\text{and}\quad I_{n,j-q}=[t_0+(j-q)n^{-1/3},\,t_0+(j-q+1)n^{-1/3}), \tag{56}$$

for $q=0,\ldots,Q_{nj}=\lceil2rn^{1/3}+j\rceil$. Then (55) is bounded above by

$$\sum_{q=0}^{Q_{nj}}P\Bigl\{\sup_{w\in I_{n,j-q}}\int_{[w,s_{njM})}dS_{nk}(u)\ge\tfrac12b(s_{njM}-t_{n,j-q+1})^2\Bigr\}. \tag{57}$$

If we bound the $q$th term in (57) by

$$p_{jqM}=\begin{cases}\exp\{-d_2(q+M)^3\}, & \text{if } j=0,\ q=0,\ldots,Q_{n0},\\ \exp\{-d_2(q+Mj^{\beta})^3\}, & \text{if } j=1,\ldots,J_n,\ q=0,\ldots,Q_{nj},\end{cases} \tag{58}$$

for some $d_2>0$, then we are done, since summing over $q$ and using $(a+b)^3\ge a^3+b^3$
for $a,b\ge0$, and defining $d_1=\sum_{q=0}^{\infty}\exp(-d_2q^3)<\infty$, yields

$$\sum_{q=0}^{Q_{nj}}p_{jqM}\le p_{jM}=\begin{cases}d_1\exp\{-d_2M^3\}, & \text{if } j=0,\\ d_1\exp\{-d_2(Mj^{\beta})^3\}, & \text{if } j=1,\ldots,J_n.\end{cases}$$

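In order to prove that such a bound holds, we introduce, for each $\theta>0$,
the time-reversed submartingale $\exp\{n\theta\int_{[w,s_{njM})}dS_{nk}(u)\}$, for $w<s_{njM}$,
with respect to the filtration $\{\mathcal{F}_w: w\le t_0+r\}$, where
$\mathcal{F}_w=\sigma\{(T_i,\Delta^i),\ i=1,\ldots,n: T_i\ge w\}$. Then, by Doob's submartingale inequality (see, e.g., [12],
Theorem 70.1, page 177), the $q$th term in (57) is, for each $\theta>0$, bounded
above by

$$\begin{aligned}
&P\Bigl\{\sup_{w\in I_{n,j-q}}\exp\Bigl\{n\theta\int_{[w,s_{njM})}dS_{nk}(u)\Bigr\}\ge\exp\{\tfrac12n\theta b(s_{njM}-t_{n,j-q+1})^2\}\Bigr\}\\
&\qquad\le\exp\{-\tfrac12n\theta b(s_{njM}-t_{n,j-q+1})^2\}\,E\exp\Bigl\{n\theta\int_{[t_{n,j-q},s_{njM})}dS_{nk}(u)\Bigr\}.
\end{aligned}\tag{59}$$

We are now left with computing an upper bound for $E\exp\{n\theta\int_{[t_{n,j-q},s_{njM})}dS_{nk}(u)\}$.
Since we have i.i.d. observations, this expectation can be written as

$$\bigl(E\exp\{\theta1_{[t_{n,j-q},s_{njM})}(T)\zeta_{nk}(T,\Delta)\}\bigr)^n, \tag{60}$$

$$\text{where}\quad\zeta_{nk}(T,\Delta)=\frac{\Delta_k-F_{0k}(T)}{F_{0k}(t_0)}-\frac{\Delta_{K+1}-F_{0,K+1}(T)}{F_{0,K+1}(t_0)}.$$

Using the exponential series and $E(\zeta_{nk}(T,\Delta)\mid T)=0$, (60) equals

$$\exp\Bigl\{n\log\Bigl\{1+E\,1_{[t_{n,j-q},s_{njM})}(T)\sum_{l=2}^{\infty}\frac{\theta^l\zeta_{nk}(T,\Delta)^l}{l!}\Bigr\}\Bigr\},$$

and since $\log(1+x)\le x$ for all $x>-1$, this is bounded above by

$$\exp\{\tfrac12nf_n(\theta,t_{n,j-q},s_{njM})\,\theta^2(s_{njM}-t_{n,j-q})\}, \tag{61}$$

$$\text{where}\quad f_n(\theta,c_1,c_2)=\frac{2}{c_2-c_1}\sum_{l=2}^{\infty}\frac{\theta^{l-2}}{l!}\int_{c_1}^{c_2}|E\{\zeta_{nk}(T,\Delta)^l\mid T=t\}|\,dG(t).$$

Next, for each pair $c_1<c_2$, we let $\theta_{c_1,c_2}$ be the solution of the equation
$\theta f_n(\theta,c_1,c_2)=\tfrac14b(c_2-c_1)$. This solution exists and is unique for all $c_1<c_2$,
since $\theta\mapsto\theta f_n(\theta,c_1,c_2)$ is a continuous increasing map from $\mathbb{R}_+$ onto
$\mathbb{R}_+$. Choosing $\theta=\theta_{t_{n,j-q},s_{njM}}$ in (61), and using that $(s_{njM}-t_{n,j-q})^2\le2(s_{njM}-t_{n,j-q+1})^2$
for all $j$ and $q$ and $M>4$, and that $s_{njM}-t_{n,j-q}\ge s_{njM}-t_{n,j-q+1}$,
yields that (59) is bounded above by

$$\exp\Bigl\{-\frac{nb^2(s_{njM}-t_{n,j-q+1})^3}{16f_n(\theta_{t_{n,j-q},s_{njM}},t_{n,j-q},s_{njM})}\Bigr\}\le\exp\Bigl\{-\frac{nb^2(s_{njM}-t_{n,j-q+1})^3}{16d}\Bigr\}, \tag{62}$$

where $d=\sup_{t_0-2r\le c_1<c_2\le t_0+2r}f_n(\theta_{c_1,c_2},c_1,c_2)$. Here we use that, for $n$
sufficiently large, all intervals $[t_{n,j-q},s_{njM})$ are contained in the interval
$[t_0-2r,t_0+2r]$. Note that $d<\infty$ since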

< i&(c2 -ci)7 r \E{U{T,A)''\T = t}\dG{t). 

Hence, there is a constant $d_2>0$ such that for all $q=0,\ldots,Q_{nj}$, the
right-hand side of (62) is bounded above by $\exp\{-d_2(q+M)^3\}$ for $j=0$, and by
$\exp\{-d_2(q+Mj^{\beta})^3\}$ for $j=1,\ldots,J_n$. □

Proof of Lemma 4.14. We first note that $\ell$ is only defined on the
event $A_{njM}=\{\hat F_{n+}(t_{n,j+1})\ge F_{0+}(s_{njM})\}$. Hence, this entire proof should
be read on the event $A_{njM}$. Furthermore, note that the lemma is trivial if
$\ell=K$, because in that case $\hat F_{n+}(u)\ge F_{0+}(s_{njM})$ for all $u\ge\tau_{n\ell j}$. Therefore,
suppose $\ell<K$. Then we typically do not have that $\hat F_{n+}(u)\ge F_{0+}(s_{njM})$
for all $u\ge\tau_{n\ell j}$, since $\hat F_{n+}$ may have jumps after $\tau_{n\ell j}$. We now
exploit the $K$-dimensional system of sub-distribution functions by breaking
$\int_{\tau_{n\ell j}}^{s_{njM}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u)$ into pieces that we analyze separately. First,
we define $\ell^*\in\{\ell,\ldots,K\}$ as follows. If

$$\int_{\tau_{n\ell j}}^{\tau_{nkj}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u)\le0\qquad\text{for all } k=\ell+1,\ldots,K, \tag{63}$$

we let $\ell^*=\ell$. Otherwise we define $\ell^*$ such that

$$\int_{\tau_{n\ell j}}^{\tau_{nkj}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u)\le0,\qquad k=\ell^*+1,\ldots,K, \tag{64}$$

$$\int_{\tau_{n\ell j}}^{\tau_{n\ell^*j}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u)>0. \tag{65}$$

Then, by (65) and the decomposition $\int_{\tau_{n\ell j}}^{s_{njM}}=\int_{\tau_{n\ell j}}^{\tau_{n\ell^*j}}+\int_{\tau_{n\ell^*j}}^{s_{njM}}$, we get

$$\int_{\tau_{n\ell j}}^{s_{njM}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u)\ge\int_{\tau_{n\ell^*j}}^{s_{njM}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u), \tag{66}$$

where strict inequality holds if $\ell\ne\ell^*$. By rearranging the sum and using the
notation $\tau_{n,K+1,j}=s_{njM}$, we can write the right-hand side of (66) as

$$\sum_{k=\ell^*+1}^{K}\int_{\tau_{n\ell^*j}}^{\tau_{nkj}}\{\hat F_{nk}(u)-F_{0k}(u)\}\,dG(u)
+\sum_{k=\ell^*}^{K}\sum_{p=1}^{k}\int_{\tau_{nkj}}^{\tau_{n,k+1,j}}\{\hat F_{np}(u)-F_{0p}(u)\}\,dG(u). \tag{67}$$

We now derive lower bounds for both terms in (67), on the event $A_{njM}\cap E_{nrC}$.
Starting with the first term, note that

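$$\int_{\tau_{n\ell^*j}}^{\tau_{nkj}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u)\le0,\qquad k=\ell^*+1,\ldots,K. \tag{68}$$

Namely, if $\ell=\ell^*$, then (68) is the same as (63). On the other hand, if
$\ell<\ell^*$, then (68) follows (with strict inequality) from (64), (65) and the
decomposition $\int_{\tau_{n\ell j}}^{\tau_{nkj}}=\int_{\tau_{n\ell j}}^{\tau_{n\ell^*j}}+\int_{\tau_{n\ell^*j}}^{\tau_{nkj}}$. Furthermore, Proposition 4.8 implies
that on the event $E_{nrC}$,

$$\int_t^{\tau_{nkj}}\bigl\{a_k\{\hat F_{nk}(u)-F_{0k}(u)\}+a_{K+1}\{\hat F_{n+}(u)-F_{0+}(u)\}\bigr\}\,dG(u)
\ge\int_{[t,\tau_{nkj})}dS_{nk}(u)-C\bigl(n^{-2/3}\vee n^{-1/3}(\tau_{nkj}-t)^{3/2}\bigr), \tag{69}$$

for $k=1,\ldots,K$ and $t<\tau_{nkj}$, where $S_{nk}$ is defined in (18). Using this
inequality with $t=\tau_{n\ell^*j}$ together with (68) yields that on the event $E_{nrC}$,

$$\int_{\tau_{n\ell^*j}}^{\tau_{nkj}}a_k\{\hat F_{nk}(u)-F_{0k}(u)\}\,dG(u)
\ge\int_{[\tau_{n\ell^*j},\tau_{nkj})}dS_{nk}(u)-C\bigl(n^{-2/3}\vee n^{-1/3}(\tau_{nkj}-\tau_{n\ell^*j})^{3/2}\bigr),$$

for $k=\ell^*+1,\ldots,K$, so that the first term of (67) is bounded below by

$$\sum_{k=\ell^*+1}^{K}\frac{1}{a_k}\Bigl[\int_{[\tau_{n\ell^*j},\tau_{nkj})}dS_{nk}(u)-C\bigl(n^{-2/3}\vee n^{-1/3}(\tau_{nkj}-\tau_{n\ell^*j})^{3/2}\bigr)\Bigr].$$

We now derive a lower bound for the second term of (67). Note that the
inequalities (44) in the definition of $\ell$ imply that on the event $A_{njM}$

$$\sum_{p=k+1}^{K}\hat F_{np}(t_{n,j+1})\le\sum_{p=k+1}^{K}F_{0p}(s_{njM}),\qquad k=\ell,\ldots,K.$$

Together with the definition of $\tau_{n1j},\ldots,\tau_{nKj}$, this yields that on the event
$A_{njM}=\{\hat F_{n+}(t_{n,j+1})\ge F_{0+}(s_{njM})\}$, we have

$$\sum_{p=1}^{k}\hat F_{np}(\tau_{npj})=\sum_{p=1}^{k}\hat F_{np}(t_{n,j+1})\ge\sum_{p=1}^{k}F_{0p}(s_{njM}),\qquad k=\ell,\ldots,K.$$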


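Furthermore, $\hat F_{np}(\tau_{npj})\le\hat F_{np}(\tau_{nkj})$ for $p\le k$ by the monotonicity of $\hat F_{np}$ and
the ordering $\tau_{n1j}\le\cdots\le\tau_{nKj}$. Hence, we get for $k=\ell,\ldots,K$ and $u\ge\tau_{nkj}$:

$$\sum_{p=1}^{k}\hat F_{np}(u)\ge\sum_{p=1}^{k}\hat F_{np}(\tau_{nkj})\ge\sum_{p=1}^{k}\hat F_{np}(\tau_{npj})\ge\sum_{p=1}^{k}F_{0p}(s_{njM}).$$

This implies that the second term of (67) is bounded below by

$$\sum_{k=\ell^*}^{K}\sum_{p=1}^{k}\int_{\tau_{nkj}}^{\tau_{n,k+1,j}}\{F_{0p}(s_{njM})-F_{0p}(u)\}\,dG(u)
=\sum_{k=1}^{K}\int_{\tau_{n,k\vee\ell^*,j}}^{s_{njM}}\{F_{0k}(s_{njM})-F_{0k}(u)\}\,dG(u).$$

Hence,

$$\begin{aligned}
&P\Bigl\{\int_{\tau_{n\ell j}}^{s_{njM}}\{\hat F_{n+}(u)-F_{0+}(u)\}\,dG(u)\le0,\ A_{njM},\ E_{nrC}\Bigr\}\\
&\quad\le P\Bigl\{\sum_{k=\ell^*+1}^{K}\frac{1}{a_k}\Bigl[\int_{[\tau_{n\ell^*j},\tau_{nkj})}dS_{nk}(u)-C\bigl(n^{-2/3}\vee n^{-1/3}(\tau_{nkj}-\tau_{n\ell^*j})^{3/2}\bigr)\Bigr]\\
&\qquad\qquad+\sum_{k=1}^{K}\int_{\tau_{n,k\vee\ell^*,j}}^{s_{njM}}\{F_{0k}(s_{njM})-F_{0k}(u)\}\,dG(u)\le0,\ A_{njM},\ E_{nrC}\Bigr\}.
\end{aligned}$$

The statement now follows by writing

$$\int_{[\tau_{n\ell^*j},\tau_{nkj})}dS_{nk}(u)=\int_{[\tau_{n\ell^*j},s_{njM})}dS_{nk}(u)-\int_{[\tau_{nkj},s_{njM})}dS_{nk}(u)$$

and several applications of Lemma 4.13. □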